PREFACE


This book is offered as a textbook for a course in mathematics
and not as a reference book for the statistician.
Although a knowledge of the calculus is necessary for the 
proper appreciation of the various principles treated here, 
the applications are carefully restricted for the most part to 
those that do not require the calculus—except, possibly for a 
proper appreciation. All theory of a controversial character 
has been studiously omitted. It should therefore be possible 
for an instructor who is properly informed to employ certain 
parts of the book as a foundation for a course in statistics 
which presupposes only a most elementary knowledge of 
mathematics. The author set out originally to write a text- 
book on statistics in which the more advanced mathematical 
theory was to be relegated to an appendix but he gradually 
became convinced that a textbook worthy of use in a mathematical
course should “go the whole way” and presuppose
a knowledge of the calculus. 

The author has departed considerably from the usual 
selection of topics—particularly in including numerical compu- 
tation and finite differences. The only reason why numerical 
computation was included is that it was regarded as all 
important and yet is rarely taught or, at least, properly 
impressed upon the student. Courses in finite differences once 
enjoyed considerable popularity but have all but disappeared 
from the curricula of the colleges of the country. The theory 
is very valuable in statistical work and there seems to be no 
valid reason why it should be omitted here. 


The main part of the book is given over for the most part 
to work connected with dispersion and this concept, once it 
is introduced, is continually emphasized throughout the rest 
of the book. There are many topics which it was a temptation 
to include but it was felt that the inclusion of some of these 
topics would affect seriously the continuity of the subject 
matter, which the author strove to maintain.

The author would indeed be lax in gratitude if he failed to 
give all credit for whatever of merit may be found herein to 
his former teachers Professor H. L. Rietz of the University 
of Iowa (formerly of the University of Illinois) and Professor 
J. W. Glover of the University of Michigan, to whom he is 
hopelessly indebted for both instruction and inspiration. He 
hastens, however, to assume for himself all responsibility for 
errors and blunders. 

Last, but not least, thanks are due to the publishers and 
printers for their untiring efforts in the preparation of the book. 


C. H. Forsyth.
INTRODUCTION TO THE MATHEMATICAL
ANALYSIS OF STATISTICS 


CHAPTER I 
ERRORS AND NUMERICAL COMPUTATION 


1. Two Fundamental Facts.—Everyone who takes up the
study of any form of applied mathematics must sooner or later 
come to appreciate at least two fundamental facts, which must 
always be kept in mind if anything like a satisfactory grasp 
of the subject is to be obtained. It is not the author’s purpose 
to enter into a detailed discussion of these two facts, but 
simply to start the student pondering—if these ideas have 
never seriously occurred to him before—until he appreciates 
fully their reality and importance. 

First of all, practically every application of mathematical 
theory, particularly to problems in natural and physical science, 
consists fundamentally in idealizing the given situation. That
is, it is rarely, if ever, possible to find a mathematical expression 
or theory which explains or fits exactly the problem under 
discussion when all attending factors and influences are taken 
into consideration. Thus, the familiar mathematical function 
which is customarily associated with the law of falling bodies 
is based upon an idealization of the situation, and corrections 
must be applied to account for the effect of air currents, atmos- 
pheric pressure, etc. In the end the mathematical function 
plus all possible corrections can be expected to fit all experimental
data only approximately.

Another fundamental fact which should be fully appreciated,
and which is closely related to the fact mentioned above,
is that all numerical measurements, or numerical results ob- 
tained by comparing the sizes of objects with a material unit 
used as a standard, are relative in value. It scarcely needs to 
be said that the absolute or exact length of a material object 
can never be determined by direct measurement, but there is 
frequent evidence that this fact is often forgotten or unappre- 
ciated. This fact applies not only to the measurement of 
length but also to such things as descriptions of locations and 
durations of time. The measurer is handicapped still further 
by the fact that the standard of measurement, chosen for its 
relative stability and rigidity, can never be perfect. 

Man is little concerned with the impossibility of determining 
absolute measurements but is vitally concerned with relative 
measurements. No one, for example, spends much thought 
upon his own absolute location, but his location relative to food, 
drink, shelter, etc., probably concerns him more than anything 
else. Fortunately, the accuracy that man needs is also relative 
and so the one thing that is important in making numerical 
measurements is to see that the accuracy of the measurements 
shall be sufficient to meet the needs at the time. 

2. Errors: Absolute and Relative Errors.—It is evident 
from what has been said that all measurements, as defined 
above, and all computations with measurements must involve 
errors of one kind or another. It follows that the absolute 
or exact values of these errors can never be determined; 
otherwise corrections could be made to give absolute measure- 
ments. Nevertheless, the conception of absolute error is useful, 
and so we shall define the absolute error of a measurement 
as the difference between the observed value and the absolute 
value. The absolute error is therefore positive or negative 
according as the observed value is too large or too small. 
We shall define, then, the relative error of a measurement as 
the ratio of the absolute error to the absolute value. 

Although the exact value of the absolute error or of the
relative error of a measurement can never be known, it is a very
important fact that their values can usually be controlled, 
in the sense that limiting values can be determined between 
which the true value must lie. For example, if an object is 
measured by a good foot rule and found to be 8 feet to the 
nearest foot, we know that the error can not be greater than 
1 foot, and the size of the absolute error is thereby controlled. 
The relative error can not be greater than 1/8 and its size is also
evidently controlled. 

3. Compensating and Accumulative Errors.—Errors are 
classified also with respect to their final and combined effect 
when their number in a given investigation is relatively large. 
Errors which tend to compensate or offset each other in the 
long run are called compensating errors. Large commercial 
concerns, which are accustomed to handling a large number of 
small sums of money daily, appreciate the fact that relatively 
trivial errors are usually compensating and tend in the long run 
to offset each other and leave only a small residuary discrep- 
ancy, which would scarcely warrant the expense of time and 
labor needed to investigate each trivial mistake. The term 
“errors” is often used to refer to much more than mere
numerical mistakes. Thus, if we should tabulate as “errors”
the deviations from the number 5 of the numbers found, say 
in the fifth decimal place of the values given in a table of 
logarithms, we should find that the excesses would tend to be 
offset in the long run by the deficiencies, or that the “errors”
in this case would be compensating errors. It should be 
emphasized that effective compensation of such errors can be 
expected only when their number is relatively large. 

Errors which tend in the long run to accumulate and give a 
relatively large combined error are called accumulative errors. 
Thus, if one should attempt to measure, by means of a defective 
foot rule, the distance between two points located a considerable 
number of feet apart, the errors would probably accumulate 
and give a relatively large error in the total distance found.

4. The Convention Followed in Expressing Measurements.
—If we find by measurement that a length is 8.4 inches and 
so express it, we thereby imply that the measurement is to be 
regarded as correct to the nearest tenth of an inch. Such an 
implication accords with a generally adopted convention in 
regard to expressing the numerical results of a measurement, 
which specifies that no more digits shall be written than are 
known to be correct, except whatever zeros may be needed to fill 
up the places of unknown digits immediately to the left of the 
decimal point or spaces immediately to the right of the decimal 
point when all the digits which are known to be correct are to 
the right of the decimal point. Digits so written, which are 
known to be correct, are called significant figures. Thus, 
there are five significant figures in each of the numbers 302.02, 
250.10, 0.0063284 and 2500.0, but only two in the number 
93,000,000 because the zeros are used to fill up the places of 
digits which are unknown. When the distance from the earth 
to the sun is given as 93,000,000 miles the result is to be re- 
garded as correct only to the nearest million of miles; that is 
the exact distance is to be regarded as lying between 92,500,000 
miles and 93,500,000 miles. On the other hand, 2500.0 feet is 
to be regarded as correct to the nearest tenth of a foot and its 
exact value lies between 2499.95 feet and 2500.05 feet.

Sometimes we possess numbers expressing measurements 
which are given with greater accuracy than we care or need 
to use. Thus, suppose that we wish to express a measured 
length of 6.4 inches in terms of centimeters and we find in a 
table of equivalent lengths that 1 inch=2.54001 centimeters. 
It would obviously be absurd to retain all of the significant 
figures in the latter number for the purposes of the problem 
just stated. When one or more digits of a number are dropped 
off, the number is said to be rounded off or simply rounded. 
A number is rounded off by dropping one or more digits at the 
right and, if the digit or digits dropped amount to more than 
one-half of one unit in the final place retained, by increasing 
the digit in that place by unity. Thus, the successive approximations
to π obtained by rounding off 3.1416, one digit at a
time, are 3.142, 3.14, 3.1 and 3.
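
The successive roundings just described can be reproduced mechanically. The following is a minimal Python sketch and not part of the original text: the name `successive_roundings` is hypothetical, and the tie-breaking mode `ROUND_HALF_EVEN` is an assumption, since the rule above only states what to do when the dropped digits exceed one-half of a unit.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def successive_roundings(value: str) -> list[str]:
    """Round `value` off one digit at a time until no decimals remain."""
    current = Decimal(value)
    places = -current.as_tuple().exponent  # digits after the decimal point
    results = []
    while places > 0:
        places -= 1
        quantum = Decimal(1).scaleb(-places)  # 0.001, 0.01, 0.1, 1
        current = current.quantize(quantum, rounding=ROUND_HALF_EVEN)
        results.append(str(current))
    return results

print(successive_roundings("3.1416"))
```

Note that rounding one digit at a time is not always the same as rounding once to the final place; the sketch follows the one-digit-at-a-time procedure of the illustration.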

The convention which has just been described refers par- 
ticularly to numerical results of measurements. A slight 
modification of this convention is usually followed in expressing 
the numerical results of computation with measurements. 
This modification will be explained in the next section. 

5. Computation with Rounded Numbers.—It has already 
been stated that although the exact value of an absolute error 
can never be determined it can be controlled, in the sense that 
certain limits can be placed upon the values that it may have, 
and that such limits can be indicated by following the generally 
accepted convention of expressing the numerical results of 
measurements which we have just explained. It is very 
important that the results of combining the numerical results of 
measurements by addition, subtraction, multiplication and 
division shall also be controlled, or that no illegitimate digits 
shall be retained as significant figures in such final results. 

Two plans will now be offered for controlling the results of 
computation with measurements; one plan will be embodied 
in a set of general rules which may be easily followed and which 
will give results that are reasonably satisfactory in that they 
are probably correct; such rules can usually be employed also 
to show at once what computation may be avoided as useless. 
These rules will be stated in the next sections. 

The other plan should be followed when a definite knowledge 
of the degree of the accuracy of the final result is essential, and 
calls for an analysis of each individual problem; it therefore 
involves a greater amount of labor and care. The plan con- 
sists simply in computing and comparing the maximum and 
minimum possible results, and can be explained best by a concrete
illustration. As the maximum possible values of the
measurements 39.2 inches and 18.3 inches are 39.25 inches (or 
so close to that number that no significant error will be com- 
mitted by assuming it) and 18.35 inches the maximum possible 
product is then 39.25 × 18.35 or 720.2375. Similarly, the
minimum possible product is 39.15 × 18.25 or 714.4875, and it
is evident on comparing these two results that if we should 
follow strictly the convention for expressing the numerical 
results of measurements our result when written would have 
only one significant figure and should be written 700. It is 
evident also, however, that if we kept only one significant 
figure considerable information would be ignored, for 717, 
the average of 720 and 714, differs by less than 3½ units from
either the maximum or the minimum possible results. More- 
over, 717 is more probably correct than either of the extreme 
values 720 and 714, and is greatly preferable to 700 since the 
exact value can not be less than 714. It is customary, then, 
in expressing the numerical results of such a computation, to 
include also digits which are probably correct, especially when 
the possible deviation from such a result is small, say when 
the value of the possible deviation can not affect the digit 
preceding the last digit retained in the final result. Thus, 
there would be no justification for retaining four digits in the 
final result, say 717.4, since the possible deviation (2.8 or 
—2.9) would be too great. 

Similarly, the maximum possible value of the quotient 
18.3 ÷ 39.2 would be 18.35 ÷ 39.15 = 0.4687 and the minimum
possible value of the quotient would be 18.25 ÷ 39.25 = 0.4650.
If the convention stated previously were strictly adhered to, it 
is evident that the quotient should be written 0.47. It is easily 
verified that the probable quotient is 0.467. 
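
The maximum and minimum results in the illustration above can be checked with exact arithmetic. This is only a sketch under stated assumptions: the helper `limits` is hypothetical, it takes each measurement to be uncertain by half a unit in its last written place, and the pairing of extremes shown is valid for positive measurements only.

```python
from fractions import Fraction

def limits(measurement: str) -> tuple[Fraction, Fraction]:
    """Smallest and largest exact values consistent with the written digits."""
    decimals = len(measurement.split(".")[1]) if "." in measurement else 0
    half_unit = Fraction(1, 2 * 10 ** decimals)
    value = Fraction(measurement)
    return value - half_unit, value + half_unit

a_lo, a_hi = limits("39.2")
b_lo, b_hi = limits("18.3")

# For positive measurements the extremes pair up as follows.
product_min, product_max = a_lo * b_lo, a_hi * b_hi
quotient_min, quotient_max = b_lo / a_hi, b_hi / a_lo

print(float(product_min), float(product_max))
print(round(float(quotient_min), 4), round(float(quotient_max), 4))
```

The same helper applies to sums and differences: the largest sum adds the two upper limits, and the largest difference subtracts the lower limit of the subtrahend from the upper limit of the minuend.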

The procedure to be followed to determine the maximum 
and the minimum values of sums and differences is analogous 
and will suggest itself to the student. 

It is rarely necessary to subject each computation to the 
individual analysis illustrated above, but it is very important 
that at least the general rules considered in the following sec- 
tions should be kept in mind and followed. 




EXERCISES 

Show that 

1. The sum of 13.26818, 138.36, 78.423, 7238.4289 and 6.324 
could not be as large as 7474.82 or as small as 7474.79. 

2. The difference between 362.34 and 47.26732 could not be as 
large as 314.09 or as small as 314.06. 

3. The product of 34.68 and 4.6 to three digits could not be greater 
than 161 or less than 158. 

4. The quotient of 36.4232 by 4.6 to two digits could not be 
greater than 8.0 or less than 7.8. 


6. Addition and Subtraction.—The main end to be sought 
in numerical computation is that no more decimal places shall 
be retained in a final result than are correct or probably correct. 
Another goal which is not so important, but which every com- 
puter will naturally appreciate and seek, is the elimination of 
all computation which would probably have no effect upon 
the final result. 

The following theorems are so self-evident that they scarcely 
call for any explanation: the absolute error of the sum of two
measurements is equal to the sum of the absolute errors of the
measurements; and the absolute error of the difference between
two measurements is equal to the difference between the absolute
errors of the measurements. Thus, if the absolute errors of the
two measurements a and b are A and B respectively, then the
sum of the two numbers is (a+A)+(b+B) or (a+b)+(A+B)
and the error committed in taking a+b as the sum is obviously
A+B. Likewise, the error committed in taking a−b as the
difference between the two numbers is evidently A−B.

Attention is called to the fact that the two theorems stated 
above are practically equivalent; for, since the absolute error 
of a measurement may be either positive or negative, the sum 
(or difference) of the absolute errors of two measurements may 
prove to be a difference (or sum) of their absolute values. In 
any case we have the important corollary: the absolute error of 
the sum or difference of two measurements whose absolute errors 
occur in different decimal places is equal approximately to the
absolute error of the less accurate measurement. For, the absolute 
error of the more accurate measurement is relatively insignifi- 
cant. We conclude then that the sum or difference of two 
measurements should not be written to more decimal places than 
are given in the less accurate measurement. To eliminate un- 
necessary computation it would seem desirable at first sight to 
round off the more accurate measurement only to within one 
place of the less accurate measurement before the addition or 
subtraction, but a little consideration will show that the same 
final result will be attained if the more accurate measurement 
is rounded off to the same decimal place as the less accurate 
measurement. As examples, the sum of 138.1 cms. and
26.032 cms. would be 138.1+26.0=164.1 cms., and the difference
between the same numbers would be 138.1−26.0=112.1.
It is easily verified that the sum could be as large as 164.18 or
as small as 164.08, and that the difference could be as large
as 112.118 or as small as 112.017.
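
The rule for two measurements can be sketched in Python as follows. The function names are hypothetical, and the rounding mode for exact ties is an assumption not fixed by the text.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def decimals(measurement: str) -> int:
    """Number of digits written after the decimal point."""
    return max(0, -Decimal(measurement).as_tuple().exponent)

def add_or_subtract(a: str, b: str, subtract: bool = False) -> Decimal:
    """Round both measurements to the coarser decimal place, then combine."""
    places = min(decimals(a), decimals(b))
    quantum = Decimal(1).scaleb(-places)
    x = Decimal(a).quantize(quantum, rounding=ROUND_HALF_EVEN)
    y = Decimal(b).quantize(quantum, rounding=ROUND_HALF_EVEN)
    return x - y if subtract else x + y

print(add_or_subtract("138.1", "26.032"))                 # sum
print(add_or_subtract("138.1", "26.032", subtract=True))  # difference
```

Measurements are passed as strings so that the number of written decimal places, which carries the information about accuracy, is not lost.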

If several measurements are to be added the measurements 
may be rounded off to within one decimal place of the least 
accurate measurements; the sum should then be rounded off 
one more place. The process is illustrated as follows: 


    136.421 cms.      136.4
     28.3    "         28.3
    321      "        321
     68.243  "         68.2
    [illegible]        17.5
                      -----
                      571.4      Ans. = 571 cms.


It is easily verified that the sum may be as large as 571.99 
or as small as 570.90. It should be evident that the sum of a 
large number of measurements found in this way might differ 
considerably from the true sum if all or most of the absolute 
errors of the measurements should happen to be of the same 
sign. However, the larger the number of measurements the 
more likely it is, in the long run, that the errors will prove to be 
compensating, and for that reason the sum so found is probably
correct. 

7. Multiplication and Division.—We shall now show that
the rules to be followed in numerical computation with multi- 
plication or division are based upon the idea of relative error. 

If the relative errors of two measurements a and b are $\alpha$ and
$\beta$, respectively, then a close approximation of the exact product
of the two measurements is $(a+a\alpha)(b+b\beta) = ab+ab(\alpha+\beta+\alpha\beta)$,
and the relative error committed in taking ab as the product is
approximately $\alpha+\beta+\alpha\beta$. If, in addition, we ignore $\alpha\beta$ as
relatively insignificant compared with either $\alpha$ or $\beta$ we have the
theorem: the relative error of the product of two measurements
is equal approximately to the sum of the relative errors of the
measurements.

Similarly, the relative error committed in taking a/b as
the quotient of the two measurements is approximately $\alpha-\beta$;
for, the absolute error committed is approximately

$$\frac{a+a\alpha}{b+b\beta} - \frac{a}{b} = \frac{a(\alpha-\beta)}{b(1+\beta)}.$$

But $\dfrac{\alpha-\beta}{1+\beta}$ differs very little from $\alpha-\beta$ and we have the theorem:
the relative error of the quotient of two measurements is equal
approximately to the difference between the relative errors of the
measurements.

Just as the two theorems cited in connection with addition 
and subtraction were found to be practically equivalent, so 
the two theorems given above are practically equivalent and for 
the same reason, namely, that absolute errors, and consequently 
relative errors, may be positive or negative, and the sum (or 
difference) of two relative errors may prove to be a difference 
(or sum) of their absolute values. In any case, we have the 
important corollary: the relative error of the product or quotient 
of two measurements having different numbers of significant 
figures is equal approximately to the relative error of the less 
accurate measurement. It follows then that such a product or 
quotient should not be written to more significant figures than
appear in the less accurate measurement. Moreover, unnecessary
computation will be avoided if the more accurate measurement
is rounded off to within one of the number of significant
figures contained in the less accurate measurement before multiplying
or dividing. As examples, the product of 118.321 cms.
and 12.1 cms. is 118.3 × 12.1 = 1430 sq. cms. The quotient of the
two numbers would be 118.3 ÷ 12.1 = 9.78. It is easily verified
that the product could be as large as 1437 or as small as 1425, and
that the quotient could be as large as 9.81 or as small as 9.74.
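
The rule for products and quotients lends itself to a short Python sketch. All function names here are hypothetical. The sketch also assumes that every written digit after any leading zeros is significant, which does not hold for a number such as 93,000,000 whose zeros merely fill places.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def figures(measurement: str) -> int:
    """Count written digits, ignoring leading zeros (all assumed significant)."""
    return len(Decimal(measurement).as_tuple().digits)

def round_to_figures(value: Decimal, n: int) -> Decimal:
    """Round `value` to n significant figures."""
    quantum = Decimal(1).scaleb(value.adjusted() - n + 1)
    return value.quantize(quantum, rounding=ROUND_HALF_EVEN)

def multiply_and_divide(a: str, b: str) -> tuple[Decimal, Decimal]:
    """Probable product a*b and quotient a/b of two measurements."""
    keep = min(figures(a), figures(b))
    # Carry one extra figure in the more accurate measurement.
    x = round_to_figures(Decimal(a), min(figures(a), keep + 1))
    y = round_to_figures(Decimal(b), min(figures(b), keep + 1))
    return round_to_figures(x * y, keep), round_to_figures(x / y, keep)

product, quotient = multiply_and_divide("118.321", "12.1")
print(product, quotient)
```

The product is printed in exponent form, which has the side benefit of showing only the three significant figures that are retained.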

As the sums, differences, products and quotients obtained by 
the rules given in this and the preceding section usually prove to 
be close approximations of averages of the corresponding max- 
imum and minimum values we shall refer to them as probable 
values. 


EXERCISES 
1. Find the maximum, minimum and probable values of the 


(a) sum of 36.4823, 2.63, 783.4 and 36.488; 
(b) difference between 38.426 and 22.1; 

(c) product of 36.2 and 4.8; 

(d) quotient of 3.64 by 4.6; 

(e) quotient of 6.2 by 38.4. 


2. Find the probable values of the 


(a) sum of 26.834, 182.3, 5284.36 and 3.2648; 
(b) difference between 324.86 and 189.7388; 
(c) product of 836.4 and 0.06; 

(d) product of 26.483 and 0.002; 

(e) quotient of 26.483 by 0.002; 

(f) quotient of 0.002 by 26.483. 


3. The radius of a circle is found by measurement to be 34.6 
inches. What is the circumference? (π = 3.1415926536 . . .)

4. Suppose that the radius of a circle were found by measurement 
to be 2.386274 inches. What is the circumference, correct to the 
nearest one-hundredth of an inch? 
(Hint: A rapid computation with the first digits of π and of
the radius will show that the result will have four significant 
figures.) 


5. Express a measured length of 6.4 inches in terms of centimeters. 
(One inch = 2.54001 cms.) 

6. Express a measured length of 34.2 cms. in terms of inches. 

7. The area of a square field is 1286 square feet, correct to the last 
place. What is the length of each side, expressed to the maximum 
number of places? 


CHAPTER II 
FINITE DIFFERENCES 


8. Discrete Values.— Everyone who has had any experience 
in plotting graphs is familiar with the fact that no matter how 
closely two or more points are plotted there are always other 
points which could be plotted between those already plotted— 
that is, if the function is continuous in that interval. In other 
words, there are no “vacant” spaces between two points on
such a graph. There are many functions, however, which have
little or no meaning except for particular values, such as
integral values, of the independent variable. Thus, the
number of people living at various ages in a given community
is limited to integral values; fractional values, for example,
would have no meaning. Likewise, $n^2$ regarded as the formula
for the sum of any number of terms of the series 1+3+5+7+
etc., has no meaning except for integral values of n. Values of
a variable which are thus restricted are said to be discrete;
that is, they are separated by “vacant” spaces.

Any set of numerical results of measurements made by 
comparison with a material standard, or computations with 
such measurements, must necessarily be discrete, because 
absolute values can never be determined. Tabulated values
(logarithmic, trigonometric, financial, etc.) must therefore be
discrete. In practically all of these cases the corresponding
mathematical expression or function is either unknown or too
complicated for purposes of valuation every time a particular value
is desired; and therefore advanced mathematical principles are
employed for computation, or the skill and experience of an
expert are employed, to experiment and set up tables of these
values for ready reference. Since these values must be discrete
it follows that not all of the values which will be needed can be 
expected to be included, especially if great accuracy is essential. 
The theory of finite differences is singularly appropriate for 
idealizing the law of uniformity of a set of discrete values, 
that is, for ascribing an artificial law of uniformity which can 
be employed readily to find other values not included in the 
table which will be sufficiently approximate to meet the needs. 

9. Definition of a Finite Difference.—If, as is customary
in the theory of finite differences, we denote a function of
x by $u_x$ (corresponding to f(x) employed in ordinary mathematical
analysis), the finite difference of $u_x$, denoted by the
symbol $\Delta u_x$, may be defined by the general relation

$$\Delta u_x = u_{x+h} - u_x,$$

where h is any real constant.

It will be found possible, however, for all our immediate
purposes, to adopt the increment h as the unit of measurement
or, what amounts to the same thing, to assume h to be unity.
We shall therefore define the finite difference of $u_x$ by the
particular relation

$$\Delta u_x = u_{x+1} - u_x. \qquad (1)$$


We shall restrict all applications of the finite difference to
applications of this particular relation. Thus, the finite difference
of $x^2$ is

$$\Delta x^2 = (x+1)^2 - x^2 = 2x+1.$$

Similarly

$$\Delta a^x = a^{x+1} - a^x = a^x(a-1),$$

and

$$\Delta \log x = \log (x+1) - \log x = \log \frac{x+1}{x}.$$


Second, third and higher differences are merely successive
differences of the first and are designated as shown in the
following examples. Thus,

$$\Delta x^3 = (x+1)^3 - x^3 = 3x^2+3x+1,$$
$$\Delta^2 x^3 = \Delta(3x^2+3x+1) = 6x+6,$$
$$\Delta^3 x^3 = 6,$$

$\Delta^4 x^3$ and higher differences of $x^3$ = 0.
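
The successive differences of the cube can be verified numerically. The sketch below is illustrative only; `delta` is a hypothetical helper that applies the definition of the finite difference with unit increment.

```python
def delta(u):
    """Return the function x -> u(x+1) - u(x), the finite difference of u."""
    return lambda x: u(x + 1) - u(x)

def cube(x):
    return x ** 3

d1 = delta(cube)   # first difference
d2 = delta(d1)     # second difference
d3 = delta(d2)     # third difference
d4 = delta(d3)     # fourth difference

for x in range(5):
    print(x, d1(x), d2(x), d3(x), d4(x))
```

Because `delta` returns a function, higher differences are obtained simply by applying it repeatedly, which mirrors the definition of second and higher differences as successive differences of the first.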


EXERCISES 


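Find the first, second and third differences of

1. [illegible in source]
2. $x^3-4x^2+2x+1$.
3. [illegible in source]

Show that

4. $\Delta C$ (C = a constant) $= 0$.
5. $\Delta C u_x = C\,\Delta u_x$.
6. $\Delta C a^{mx} = C(a^m-1)a^{mx}$.
7. $\Delta^3 \log x = \log \dfrac{(x+1)^3(x+3)}{x(x+2)^3}$.
8. $\Delta(u_x+v_x+w_x+\text{etc.}) = \Delta u_x+\Delta v_x+\Delta w_x+\text{etc.}$
9. $\Delta u_x v_x = v_{x+1}\Delta u_x+u_x\Delta v_x$.
10. $\Delta \dfrac{u_x}{v_x} = \dfrac{v_x\Delta u_x-u_x\Delta v_x}{v_x v_{x+1}}$.

11. $\Delta^n x^n = n!$ where n is a positive integer and $n! = n(n-1) \ldots 3\cdot 2\cdot 1$.

12. $\Delta^m x^n = 0$ where m is the greater of the two positive integers
m and n.

13. Prove the identity: $u_{x+1}\Delta v_x+v_x\Delta u_x = v_{x+1}\Delta u_x+u_x\Delta v_x$.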


10. Tabulations of Differences.—It is often desirable to
employ some form of the following scheme of tabulating
differences:

$$\begin{array}{llll}
u_x & & & \\
 & \Delta u_x & & \\
u_{x+1} & & \Delta^2 u_x & \\
 & \Delta u_{x+1} & & \Delta^3 u_x \\
u_{x+2} & & \Delta^2 u_{x+1} & \\
 & \Delta u_{x+2} & & \\
u_{x+3} & & & \\
\text{etc.,} & & &
\end{array} \qquad (A)$$

where each expression is the difference between the two expres- 
sions immediately to the left—the lower minus the upper. 
If zero is substituted for x in the scheme just given we have 


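$$\begin{array}{llll}
u_0 & & & \\
 & \Delta u_0 & & \\
u_1 & & \Delta^2 u_0 & \\
 & \Delta u_1 & & \Delta^3 u_0 \\
u_2 & & \Delta^2 u_1 & \\
 & \Delta u_2 & & \\
u_3 & & & \\
\text{etc.,} & & &
\end{array} \qquad (B)$$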

which is equivalent graphically to a translation of the y-axis 
in scheme (A) x units to the right. As this transformation is 
common in analytic work and often simplifies computation, 
scheme (B) can usually be made to fit any given situation and 
so will prove very useful in much that follows. 

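The terms $u_0$, $\Delta u_0$, $\Delta^2 u_0$, etc., are called the leading term and
leading differences of $u_x$ and form what is called the principal
diagonal. As an illustration, if $u_x = x^3$,

$$\begin{array}{lllll}
u_0 = 0 & & & & \\
 & \Delta u_0 = 1 & & & \\
u_1 = 1 & & \Delta^2 u_0 = 6 & & \\
 & \Delta u_1 = 7 & & \Delta^3 u_0 = 6 & \\
u_2 = 8 & & \Delta^2 u_1 = 12 & & \Delta^4 u_0 = 0 \\
 & \Delta u_2 = 19 & & \Delta^3 u_1 = 6 & \\
u_3 = 27 & & \Delta^2 u_2 = 18 & & \\
 & \Delta u_3 = 37 & & & \\
u_4 = 64 & & & & \\
\text{etc.,} & & & &
\end{array}$$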
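which is usually written in a more abbreviated form, such as

$$\begin{array}{rrrrr}
u_x = x^3 & \Delta u_x & \Delta^2 u_x & \Delta^3 u_x & \Delta^4 u_x \\
0 & & & & \\
 & 1 & & & \\
1 & & 6 & & \\
 & 7 & & 6 & \\
8 & & 12 & & 0 \\
 & 19 & & 6 & \\
27 & & 18 & & \\
 & 37 & & & \\
64 & & & & \\
\text{etc.} & & & &
\end{array}$$

The leading term and differences of $x^3$ are then 0, 1, 6, 6, 0,
etc.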
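
A table of differences of this kind is easily built by machine. The following Python sketch uses a hypothetical helper, `difference_table`, which forms each column as "the lower minus the upper" of the column to its left, and then reads off the leading term and leading differences.

```python
def difference_table(values):
    """Return [values, first differences, second differences, ...]."""
    columns = [list(values)]
    while len(columns[-1]) > 1:
        previous = columns[-1]
        # Each entry is the lower neighbour minus the upper neighbour.
        columns.append([lower - upper for upper, lower in zip(previous, previous[1:])])
    return columns

table = difference_table([x ** 3 for x in range(5)])
for order, column in enumerate(table):
    print(order, column)

# The principal diagonal: the first entry of every column.
leading = [column[0] for column in table]
print(leading)
```

Each column is one entry shorter than the one before it, so five tabulated values yield differences up to the fourth order and no further.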


EXERCISES 


Arrange in tabular form (B) the values and differences of the fol- 
lowing functions: 


1. $3x+2$.
2. $x^2-7x+12$.
3. [illegible in source]
4. $x^4-3x^3+4x^2-6x+2$.

5. How does the tabular arrangement of the function $4^x$ differ
fundamentally from those of the functions given in the preceding
exercises?

6. Tabulate the values log 460=2.662758, log 462=2.664642,
log 464=2.666518 and log 466=2.668386 and their differences.
Would you say that the differences finally vanish absolutely? Explain.

7. Check the relations

(a) $u_{x+n} = u_x+n\Delta u_x+\dfrac{n(n-1)}{2!}\Delta^2 u_x+\text{etc.},$

(b) $\Delta^n u_x = u_{x+n}-nu_{x+n-1}+\dfrac{n(n-1)}{2!}u_{x+n-2}-\text{etc.},$

for n=1, 2 and 3 using the relations shown in table (A).




11. Rational Integral Functions.—A rational integral function
is a function that can be written in the form

$$ax^n+bx^{n-1}+cx^{n-2}+\text{etc.},$$

where the coefficients are real and the exponents are positive
integers. Thus, $x^3+4x^2-6x+2$ is a rational integral function.
Since each difference of $x^n$, where n is a positive integer,
lowers the degree of that function by unity, and since

$$\Delta(u_x+v_x+w_x+\text{etc.}) = \Delta u_x+\Delta v_x+\Delta w_x+\text{etc.},$$
and $\Delta(\text{constant}) = 0$,

it follows that each difference of a rational integral function
lowers the degree of the function by unity and that, after a
while, differences are finally reached which are zero. This fact
distinguishes rational integral functions from all other functions.
It is left for the student to show that the third difference of
$x^3+4x^2-6x+2$ is constant and that higher differences are
therefore zero.
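
A numerical check of this statement, which illustrates but does not prove it, may be sketched in Python. The helper `differences` and the choice of eight tabulated values are assumptions made for the illustration.

```python
def differences(values, order):
    """Apply the finite difference `order` times to a list of tabulated values."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values

def p(x):
    # The rational integral function of the third degree used in the text.
    return x ** 3 + 4 * x ** 2 - 6 * x + 2

values = [p(x) for x in range(8)]
third = differences(values, 3)
fourth = differences(values, 4)
print(third)
print(fourth)
```

The constant value of the third difference is 3! times the coefficient of the highest power, in agreement with the result that the n-th difference of the n-th power of x is n!.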

12. Factorials.—If we define the expressions $x^{(n)}$ and $x^{(-n)}$
by the relations

$$x^{(n)} = x(x-1)(x-2) \ldots (x-n+1),$$

$$x^{(-n)} = \frac{1}{x(x+1)(x+2) \ldots (x+n-1)},$$

it is easily verified that

$$\Delta x^{(n)} = nx^{(n-1)} \qquad (2)$$

and

$$\Delta x^{(-n)} = -nx^{(-n-1)}. \qquad (3)$$

Thus,

$$\Delta x^{(3)} = \Delta x(x-1)(x-2) = 3x^{(2)} = 3x(x-1).$$
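
The two difference formulas for factorials can be checked with exact rational arithmetic. In this sketch the function `factorial_power` is a hypothetical name; it takes a positive second argument for the factorial with positive index and a negative one for the reciprocal factorial, with the reciprocal form assumed to begin at the factor x.

```python
from fractions import Fraction

def factorial_power(x, n):
    """Factorial of x with index n; negative n gives the reciprocal form."""
    result = Fraction(1)
    if n >= 0:
        for i in range(n):
            result *= x - i      # x(x-1)...(x-n+1)
        return result
    for i in range(-n):
        result /= x + i          # 1 / [x(x+1)...(x+|n|-1)]
    return result

def delta(f):
    """Finite difference with unit increment."""
    return lambda x: f(x + 1) - f(x)

for n in range(1, 5):
    for x in range(1, 6):
        positive = delta(lambda t: factorial_power(t, n))(x)
        negative = delta(lambda t: factorial_power(t, -n))(x)
        assert positive == n * factorial_power(x, n - 1)
        assert negative == -n * factorial_power(x, -n - 1)
print("both difference formulas hold for the values tried")
```

Only positive integral values of x are tried here, so that no factor in a denominator vanishes.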
Expressions of the form $x^{(n)}$ and $x^{(-n)}$ are called factorials
in the theory of finite differences and will be found to play a
very important part in the work of the succeeding pages.
Such factorials should not be confused with the factorial
$n! = n(n-1)(n-2) \ldots 3\cdot 2\cdot 1$, common in ordinary mathematical
analysis.


EXERCISES 
1. Verify formula (2).
2. Verify formula (3).

3. Given $_xC_r$ (the number of different combinations of x things
taken r at a time) $= \dfrac{x(x-1)(x-2) \ldots (x-r+1)}{r!}$, find the first,
second and r-th differences and express the results in terms of the
symbol $_xC_r$.

If we define $u_x^{(n)}$ and $u_x^{(-n)}$ by the relations

$$u_x^{(n)} = u_x u_{x-1} u_{x-2} \ldots u_{x-n+1},$$

$$u_x^{(-n)} = \frac{1}{u_x u_{x+1} u_{x+2} \ldots u_{x+n-1}},$$

show that

4. $\Delta u_x^{(n)} = (u_{x+1}-u_{x-n+1})\,u_x^{(n-1)}$.

5. $\Delta u_x^{(-n)} = (u_x-u_{x+n})\,u_x^{(-n-1)}$.

6. $\Delta(ax+b)^{(n)} = an(ax+b)^{(n-1)}$.

Check by using the result given in Exercise 4.

7. $\Delta(ax+b)^{(-n)} = -an(ax+b)^{(-n-1)}$.

Check by using the result given in Exercise 5.
8. Show that [first relation illegible in source],

and $(x+k)^{(-n)} = \dfrac{1}{(x+k+n-1)^{(n)}}$.

9. Find $\Delta(x+k)^{(n)}$ and $\Delta(x+k)^{(-n)}$.
13. Newton’s Formula.

Since

$$u_1 = u_0+\Delta u_0,$$
$$u_2 = u_1+\Delta u_1 = (u_0+\Delta u_0)+(\Delta u_0+\Delta^2 u_0) = u_0+2\Delta u_0+\Delta^2 u_0,$$
$$u_3 = u_0+3\Delta u_0+3\Delta^2 u_0+\Delta^3 u_0,$$
etc.,

we have, in general

$$u_x = u_0+x\Delta u_0+\frac{x(x-1)}{2!}\Delta^2 u_0+\frac{x(x-1)(x-2)}{3!}\Delta^3 u_0+\text{etc.}, \qquad (4)$$


where, it is to be noted, the coefficients follow the binomial law. 
Expansion (4) is called Newton’s formula. 

Newton’s formula is so important that we shall derive it in
another and more rigorous manner, as follows. Let us consider
those functions which it is reasonable to assume can be
expanded in the form

$$u_x = a+bx+cx^{(2)}+dx^{(3)}+\text{etc.} \qquad (C)$$

Then, evidently, $a = u_0$. Moreover,

$$\Delta u_x = b+2cx+3dx^{(2)}+\text{etc.},$$

and $b = \Delta u_0$. Similarly,

$$c = \frac{\Delta^2u_0}{2!}, \qquad d = \frac{\Delta^3u_0}{3!}, \qquad \text{etc.}$$

Substituting these expressions for the coefficients in equation (C),
we obtain Newton’s formula in the form

$$u_x = u_0+x\,\Delta u_0+x^{(2)}\frac{\Delta^2u_0}{2!}+x^{(3)}\frac{\Delta^3u_0}{3!}+\text{etc.} \qquad (4)$$

It is possible to expand most functions by Newton’s formula,
but rational integral functions are unique in that their expansions
always involve only a finite number of terms. As an
illustration, let us expand $x^3$. As the leading term and differences
were found previously to be 0, 1, 6, 6, 0, etc., these
values may be substituted in Newton’s formula to give

$$x^3 = x+3x^{(2)}+x^{(3)} = x+3x(x-1)+x(x-1)(x-2).$$

It is easily verified that the expression on the right can be
reduced to $x^3$.
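The expansion of $x^3$ by Newton's formula can be reproduced mechanically. The following Python sketch is an illustration only; the helper names `leading_differences` and `newton` are hypothetical and do not come from the text.

```python
from fractions import Fraction

def leading_differences(values):
    """Leading term and leading differences u_0, Δu_0, Δ²u_0, ... of a list of values."""
    lead, row = [], list(values)
    while row:
        lead.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return lead

def newton(lead, x):
    """Evaluate u_0 + x Δu_0 + x^(2) Δ²u_0/2! + ... for the given leading differences."""
    total, coeff = Fraction(0), Fraction(1)
    for k, d in enumerate(lead):
        total += coeff * d
        coeff = coeff * (x - k) / (k + 1)   # next coefficient x^(k+1)/(k+1)!
    return total

lead = leading_differences([x**3 for x in range(5)])
print(lead)              # [0, 1, 6, 6, 0]
print(newton(lead, 7))   # 343
```

Because the differences of $x^3$ beyond the third vanish, the five tabulated values suffice to reproduce the function for every $x$.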


The general term of a sequence (or of a series), such as 
OO Os Onsen 


can be determined in like manner, provided the general term 
is rational integral. Thus, if the terms of the series

$$1+2+5+10+\text{etc.}$$

are differenced, the leading term and differences are found to be
1, 1, 2, 0, etc., and if these values are substituted in Newton’s
formula we obtain the general term $1+x+x^{(2)}$ or $1+x^2$. Can
the general term of the series $1+3+3^2+3^3+$ etc. be determined
in this manner? Explain.

14. The General Term of a Sequence or of a Series.—In 
ordinary mathematical analysis we usually understand the 
general term of a sequence or of a series to be the expression 
which gives the value of that term when the number of that 
term is substituted in the expression. In the theory of finite 
differences it is usually more natural to assume that a series
is of the form

$$u_0+u_1+u_2+u_3+\text{etc.}$$

or that the general term will give the first term for $x=0$, the
second term for $x=1$, etc. According to that assumption the
general term of the series $1+8+27+64+$ etc. would be $(x+1)^3$
and not $x^3$.

Although, as will be shown later, any form of general term is 
permissible in the theory of finite differences, it will be well 
for the student to adopt the form suggested above until he 
becomes familiar with the modifications necessary when any 
other form is employed. 


EXERCISES 


Find the general term of the following series having w% as the first 
term: 
. 145+9+138+17-+ ete. (Check your results.) 
. 1+4+9-+16-+ ete. 
. 1+04+14+4+16-+4 ete. 
. 8+872+33+ 34+ ete. 
. 8+16+4+382+64+ ete. 
5 De OE INSEL aie. Ans. 2+52+2%+427®, 
Expand the following functions by Newton’s formula: 
7 2+. 
Bh SPA 
9. e3ta2tae+1. 
10. Expand $x^2+x+1$ by the formula

$$u_x = u_{-k}+(x+k)\,\Delta u_{-k-1}+\frac{(x+k)(x+k+1)}{2!}\Delta^2u_{-k-2}+\text{etc.},$$

for $k=1$. Ans. $1-2(x+1)+(x+1)(x+2)$.


15. Finite Integration: Definite Integrals. Finite integration
may be defined as the inverse of finite differencing; thus,
since the finite difference of $x^2$ is $2x+1$, the finite integral of
$2x+1$ is $x^2$. Since, however, $\Delta(x^2+C)$, where $C$ is any constant,
is also $2x+1$, the finite integral of $2x+1$, written $\Sigma(2x+1)$,
should be written

$$\Sigma(2x+1) = x^2+C.$$

The constant $C$ is called the constant of integration. Since,
in general, the value of the constant of integration is unknown,
such integrals are called indefinite integrals.

If each of two constants is substituted for the variable in an
indefinite integral and the difference between the two results
taken, the constant of integration is eliminated. Such a difference
is called a definite integral and the constants which are
substituted are called limits. Thus, the definite integral of
$2x+1$ for the limits 1 and 4 is

$$\Sigma_1^4(2x+1) = \left(x^2+C\right)\Big|_1^4 = 16-1 = 15.$$

In this case the “4” is called the upper limit and the “1”
the lower limit. What would be the effect of interchanging the
limits?

It is easy to verify the following fundamental formulas by
differencing the expressions on the right:

$$\Sigma x^{(n)} = \frac{x^{(n+1)}}{n+1}+C, \qquad (5)$$

$$\Sigma x^{(-n)} = \frac{x^{(-n+1)}}{-n+1}+C, \qquad (6)$$

$$\Sigma ka^{mx+n} = \frac{ka^{mx+n}}{a^m-1}+C. \qquad (7)$$


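16. Summation of Series. From the following scheme

$$\begin{array}{ll}
u_0 & \\
 & \Delta u_0 \\
u_1 & \\
 & \Delta u_1 \\
u_2 & \\
 & \Delta u_2 \\
\vdots & \\
u_{n-1} & \\
 & \Delta u_{n-1} \\
u_n &
\end{array}$$

it is evident that

$$\Delta u_0+\Delta u_1+\Delta u_2+\cdots+\Delta u_{n-1} = \sum_{x=0}^{n-1}\Delta u_x = \Sigma_0^n\,\Delta u_x = (u_x+C)\Big|_0^n = u_n-u_0. \qquad (8)$$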


It is very important that the distinction between the two
symbols

$$\sum_{x=0}^{n-1}\Delta u_x \quad\text{and}\quad \Sigma_0^n\,\Delta u_x,$$

as used here, be well understood. The former means simply
the sum of all terms of the form $\Delta u_x$ from $x=0$ to $x=n-1$
inclusive, while the latter means the definite integral of $\Delta u_x$
between the limits 0 and $n$.

Since $u_x+C$ is the integral of $\Delta u_x$, relations (8) show that
the sum of a finite number of terms of a given series is given by the
definite integral of the general term. The lower limit is the
same as the value of $x$ corresponding to the first term, while
the upper limit is one unit greater than the value of $x$
corresponding to the last term. Thus, the sum of the first $n$
terms of the series $1+3+5+7+$ etc. is

$$\sum_{x=0}^{n-1}(2x+1) = \Sigma_0^n(2x+1) = (x^2+C)\Big|_0^n = n^2.$$

It is also easily shown that

$$\sum_{x=k}^{n-1}\Delta u_x = \Sigma_k^n\,\Delta u_x = u_n-u_k. \qquad (9)$$

Relation (9) is to be used when the form of the general term
has not been selected in accordance with the suggestion made
previously, that $u_0$ be the first term. Thus, the sum of the
first $n$ terms of the series considered above is given also by

$$\sum_{x=1}^{n}(2x-1) = \Sigma_1^{n+1}(2x-1),$$

and also by

$$\sum_{x=2}^{n+1}(2x-3) = \Sigma_2^{n+2}(2x-3).$$

It is suggested that, even though the sum of a specific number
of terms of a series is to be determined, it is well to determine
first the sum of a general number, say $n$, of the terms, in
order that a check upon the work may be obtained; the specific
number can then be substituted for $n$. The check is made by
substituting 1, 2, etc., in turn for $n$ in the general result found
and comparing the successive results with the actual sum of
one, two, etc., terms.

17. Series Whose General Terms are Rational Integral
Functions. It has already been shown that if the general
term of a series is rational integral it may be easily determined
by Newton’s formula. Moreover, the use of Newton’s formula
insures that the general term so found shall be expressed in a
form which can be easily integrated. Thus, for example, even
though the general term of a series is known to be $x^2+x+1$, it
would be highly inconvenient or difficult to determine the
finite integral directly, but if the first few terms of the series
$1+3+7+13+$ etc. were differenced and the leading term and
differences were substituted in Newton’s formula we obtain
the general term in the form $1+2x+x^{(2)}$, which can be integrated
term by term. It is easily verified that the sum of the first $n$
terms of the series is $\tfrac{1}{3}(n^3+2n)$.
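The summation of the series $1+3+7+13+$ etc. by term-by-term integration can be checked as follows. This is a sketch under stated assumptions: it takes the general term in the factorial form $1+2x+x^{(2)}$, applies formula (5) to each term, and compares the result with direct addition and with the closed form $\tfrac{1}{3}(n^3+2n)$; the function names are hypothetical.

```python
from fractions import Fraction

def factorial_power(x, n):
    """x^(n) = x(x-1)...(x-n+1) for a non-negative integer n."""
    product = 1
    for i in range(n):
        product *= x - i
    return product

def finite_integral(x):
    """Σ(1 + 2x^(1) + x^(2)) = x^(1) + x^(2) + x^(3)/3 by formula (5), taking C = 0."""
    return (factorial_power(x, 1) + factorial_power(x, 2)
            + Fraction(factorial_power(x, 3), 3))

def sum_first_n(n):
    """Definite finite integral between the limits 0 and n."""
    return finite_integral(n) - finite_integral(0)

for n in range(1, 10):
    direct = sum(x * x + x + 1 for x in range(n))
    assert sum_first_n(n) == direct == Fraction(n**3 + 2 * n, 3)
```

The loop is the check recommended in the preceding section: the general result is compared with the actual sum of one, two, etc., terms.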

If only the series is given, the general term may be found, 
of course, in exactly the same way, the general term being 
assumed to be rational integral if higher differences are found 
to vanish. 


EXERCISES 
Find the sum of the first n terms of the following series: 


1 14+38+9+19+33+51-+ ete. 

2. —1+0+7+26+63+124+ ete. 
3. 1+38+7+4+13421+31- ete. 

4. 1+84+7+138+214+4 ete. 

6. 1.384+2.4+3.5+4.6-4 ete. 
6. (a) 1+3+5+7+ etc.
(6) 14+8+5+7+10+16+28-+ ete. 
Explain the difference between these two examples. 

7. 2+6+18+54+4 162+ ete., using 2-37~! as the general term. 
8. 2+4+8+16+432-+ etc., using uw. as the first term. 
9. 3+6412+24+48-4 ete. 
10. 34+3°+3°+37+4 etc., using 3” as the general term. 
11. 1+2+5+4+10+17- etc., with wu, as the first term. 
12. 2-44+4-6+6-8+8-10+4 etc., in two ways. 

(Suggestion: the general term may be written (27+4)™.) 
13. 1-24+2-3+3-4+4-5-+ etc., in two ways. (uz=(x+2)®.) 
14. 1-2-342-3-443-4-5+ etce., in two ways. 
15. 1-34+2-443-5+4.-6-+ ete. Pe : 
16. Show by actual summation that both $\sum_{0}^{n-1}(2x+1)$ and $\sum_{1}^{n}(2x-1)$
will give the sum of $n$ terms of the series $1+3+5+7+$ etc.

17. Show by actual summation that both $\sum_{0}^{n-1}(x+1)^3$ and $\sum_{1}^{n}x^3$ will
give the sum of $n$ terms of the series $1+8+27+64+125+$ etc.


18. Series Whose General Terms Are Not Rational In- 
tegral.—Series whose general terms are rational integral occur 
quite frequently in practice but constitute only one of the 
large number of types which should be considered in a com- 
plete discussion of the subject of summation of series. As a 
complete treatment of the subject is clearly beyond the scope 
of this book, we shall restrict further discussion to a considera- 
tion of a few of the most important types. The general terms 
of these types of series will, in general, be determined by mere 
inspection. 

Geometric series, or series whose general terms are of the
form $ka^{mx}$, are readily summed by formula (7). Thus, the sum
of the first $n$ terms of the series

$$2+2\cdot3+2\cdot3^2+2\cdot3^3+\text{etc.}$$

is

$$\Sigma_0^n\,2\cdot3^x = \left(3^x\right)\Big|_0^n = 3^n-1.$$

Many series can be summed by employing the formula
known as the formula for integration by parts. Thus, since

$$\Delta u_xv_x = v_{x+1}\Delta u_x+u_x\Delta v_x,$$

$$\Sigma u_x\Delta v_x = u_xv_x-\Sigma v_{x+1}\Delta u_x+C. \qquad (10)$$


Formula (10) is especially useful in summing various types
of series whose general terms consist only in part of rational
integral functions. In applying the formula it is customary
to choose for $u_x$ that part of the general term which is rational
integral, so that it will appear either as a constant or with
lower degree in the integral on the right. In the latter case the
application of the formula must be repeated until the rational
integral portion does appear as a constant on the right. The
part chosen for $\Delta v_x$ must, of course, be integrable. As an
illustration, let us determine the sum of the first $n$ terms of the
series

$$1\cdot1+3\cdot3+5\cdot3^2+7\cdot3^3+\text{etc.},$$

whose general term is evidently $(2x+1)3^x$.
Letting $u_x = 2x+1$ and $\Delta v_x = 3^x$, so that $v_x = 3^x/2$ by formula (7),
and applying formula (10),

$$\Sigma(2x+1)3^x = (2x+1)\frac{3^x}{2}-\Sigma3^{x+1}+C = (2x+1)\frac{3^x}{2}-\frac{3^{x+1}}{2}+C.$$

Substituting the limits 0 and $n$ and subtracting, the sum of
the first $n$ terms reduces to $3^n(n-1)+1$.

It is well to emphasize the fact that the subscript of $v_x$
changes from $x$ to $x+1$ each time formula (10) is applied.
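The result of the integration by parts may be checked against direct addition. The sketch below is illustrative only and assumes the indefinite integral in the simplified form $(x-1)3^x$, which is what $(2x+1)3^x/2-3^{x+1}/2$ reduces to when $C=0$; the function names are hypothetical.

```python
def finite_integral(x):
    """Σ(2x+1)3^x with C = 0: (2x+1)3^x/2 - 3^(x+1)/2, which simplifies to (x-1)3^x."""
    return (x - 1) * 3**x

def sum_by_parts(n):
    """Definite finite integral between the limits 0 and n."""
    return finite_integral(n) - finite_integral(0)

def sum_direct(n):
    """Actual sum of the first n terms 1*1 + 3*3 + 5*3^2 + ..."""
    return sum((2 * x + 1) * 3**x for x in range(n))

for n in range(1, 10):
    assert sum_by_parts(n) == sum_direct(n) == 3**n * (n - 1) + 1
```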

Occasionally a general term is met which can be integrated
either by formula (6) or by one very similar to it, but it is very
important to note that in such a case the degree of the denominator
must always exceed the degree of the numerator by
at least 2. The difficulty otherwise encountered probably constitutes
the greatest obstacle met in summing series by finite
integration. As an example illustrating the possible difficulty,
let us attempt to determine the sum of the first $n$ terms of the
series

$$1+\tfrac12+\tfrac13+\tfrac14+\text{etc.}$$

The general term is $1/(x+1)$ or $(x+1)^{(-1)}$. According to
Exercise 7, p. 18,

$$\Sigma(x+1)^{(-1)} = \frac{(x+1)^{(0)}}{0}+C,$$

which is meaningless. Interesting attempts have been made to
overcome such difficulties, and interesting results have been
attained, but as the results are of little practical value for
present purposes further consideration of the problem must
be omitted here.

One type of a rational function (defined as the quotient of
two rational integral functions) has a general term of the form
$f(x)(x+k)^{(-n)}$ or more rarely $f(x)(ax+b)^{(-n)}$, where $f(x)$ is
rational integral. In either case it is only necessary to expand
the numerator into a series with terms of the same form as
those appearing in the denominator, and then to break up the
general term so expressed into a sum of several expressions.
The general expression for the numerator can be determined
by inspection or by Newton’s formula, and its expansion can
usually be made by inspection. As an example, the general
term of the series

$$\frac{1}{1\cdot2\cdot3}+\frac{3}{2\cdot3\cdot4}+\frac{5}{3\cdot4\cdot5}+\frac{7}{4\cdot5\cdot6}+\text{etc.}$$

is evidently

$$\frac{2x+1}{(x+1)(x+2)(x+3)},$$

and the numerator $2x+1$ can be written by inspection
$-1+2(x+1)$. (See Exercise 10, p. 21.) Hence, the general
term can be broken up into two parts $-(x+1)^{(-3)}+2(x+2)^{(-2)}$.
It is easily verified that the sum of $n$ terms of the series is

$$\frac{1}{2(n+1)(n+2)}-\frac{2}{n+2}+\frac{3}{4}.$$

(See Exercise 7, p. 18.)
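The sum just stated can be verified with exact fractions. The sketch below is an illustration, not the author's working; it assumes the general term $(2x+1)/((x+1)(x+2)(x+3))$ with $x=0$ for the first term, and the closed form $\frac{1}{2(n+1)(n+2)}-\frac{2}{n+2}+\frac{3}{4}$ obtained by integrating the two parts with formula (6). The function names are hypothetical.

```python
from fractions import Fraction

def general_term(x):
    """General term of 1/(1*2*3) + 3/(2*3*4) + 5/(3*4*5) + ..., first term at x = 0."""
    return Fraction(2 * x + 1, (x + 1) * (x + 2) * (x + 3))

def closed_form(n):
    """Sum of the first n terms from the finite integral taken between 0 and n."""
    return Fraction(1, 2 * (n + 1) * (n + 2)) - Fraction(2, n + 2) + Fraction(3, 4)

for n in range(1, 20):
    assert sum(general_term(x) for x in range(n)) == closed_form(n)
```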


EXERCISES 


Find the sum of $n$ terms of the series:

1. $-1\cdot1+2\cdot2+5\cdot2^2+8\cdot2^3+\cdots+(3x-1)2^x+$ etc.

2. $1\cdot1+2\cdot3+5\cdot3^2+10\cdot3^3+\cdots+(x^2+1)3^x+$ etc.

3. Whose general term is $(2x+3)(x+2)^{(-3)}$.

4. Whose general term is $(3x+4)(x-1)^{(-3)}$.

5. Whose general term is $\dfrac{2x+1}{(x+1)(x+3)(x+4)}$. Multiply numerator
and denominator by $x+2$.

6. Whose general term is $\dfrac{2x+2}{(3x+1)(3x+4)(3x+7)}$. Use the formula

$$\Sigma(ax+b)^{(n)} = \frac{(ax+b)^{(n+1)}}{a(n+1)}$$

with $n$ negative.


MISCELLANEOUS EXERCISES 


Determine the sum of the first n terms of the following series: 
. 144494164254 .... Ans. $(2n8+8n?+n). 
» 1+5+114+194+29+ .... 

1+8+27+644+125+ .... 

1+5+52?+53+544+ .... 

68+-64+-65+ 68+ .... 

3+8+15+24+35+ .... 

» 2+12+86+80+150+ .... 

» 44+18+484100+180+ .... 

1-1-+42?-2+32.22+42.934 |... 

» 1-2422-22432.23442.941 ||. 

» 1-1438-5+5-52+7-53+9-544 2. 

» 1-34+4-32+-9 -334-16-34125-35+ ..., 


PHONARarope 


aa 
nro 
cs cred te et ened!
13. - 
3 TOS aaa 


5 
pa Ps Ng eee Dc 
ae rue 6: 7 6.74 7: ca 


a rao ‘s eh 


16... more 
1g3: Tehipe 1 Ba a ae 9. wee 


17. (a) 1+8+6+10+154+.... 
(6) 14+3+6+10+15+22+33+51+ .... 
Compare the results obtained in (a) and (6). Explain the nature 
of the assumptions underlying each problem. 
18. 0+0+0+6+24+60+120+ .... 
(a) using the first three terms. 
(6) omitting the first three terms. Compare the results and 
explain the difference. 


19. Algebraic Treatment of Symbols.—Let us now refer back 
to some of the operations of the finite calculus and introduce 
the derivative to arrive at an interesting formula. 

If we define the operator $E$ by the relation

$$Eu_x = u_{x+1},$$

so that

$$E^nu_x = u_{x+n},$$

then, since

$$\Delta^nu_x = u_{x+n}-n\,u_{x+n-1}+\frac{n(n-1)}{2!}u_{x+n-2}-\text{etc.} \quad\text{(see Ex. 7b, p. 16)},$$

we may write

$$\Delta^nu_x = \left(E^n-nE^{n-1}+\frac{n(n-1)}{2!}E^{n-2}-\text{etc.}\right)u_x,$$

where all the terms within the parenthesis are to be considered
as operating upon $u_x$; or, expressing the relation more
compactly,

$$\Delta^nu_x = (E-1)^nu_x,$$

or

$$\Delta^n = (E-1)^n,$$

and

$$\Delta = E-1. \qquad (11)$$

¹The general expression for the numerator can be obtained by
Newton’s formula.


It is left for the student to show that Newton’s formula 
can be expressed symbolically, 


$$E^xu_0 = (1+\Delta)^xu_0 \quad\text{(see Ex. 7a, p. 16)},$$
or
$$E = 1+\Delta,$$


which agrees with (11). The preceding is a good illustration 
of the possibilities of treating various operations as algebraic 
operations. Although such a procedure calls for special 
interpretation and a certain amount of check in some cases, 
as we shall find, it frequently enables one to arrive at results 
with much less labor than the ordinary procedure would 
require. 
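The symbolic relations $\Delta^n=(E-1)^n$ and $E^x=(1+\Delta)^x$ can be tested on any sequence. The following sketch is an illustration only; the function names and the test function $2^x+x^3$ are assumptions, and the expansion of $(1+\Delta)^x$ is carried out only for a non-negative integer $x$.

```python
from math import comb

def delta_n(u, x, n):
    """n-th difference of u at x by repeated differencing."""
    if n == 0:
        return u(x)
    return delta_n(u, x + 1, n - 1) - delta_n(u, x, n - 1)

def delta_n_symbolic(u, x, n):
    """n-th difference from the expansion of (E - 1)^n, with E^k u_x = u_{x+k}."""
    return sum((-1) ** k * comb(n, k) * u(x + n - k) for k in range(n + 1))

def newton_symbolic(u, x):
    """u_x = (1 + Δ)^x u_0, expanded by the binomial theorem for integer x >= 0."""
    return sum(comb(x, k) * delta_n(u, 0, k) for k in range(x + 1))

u = lambda x: 2**x + x**3    # arbitrary test function (an assumption)

for n in range(5):
    assert delta_n(u, 2, n) == delta_n_symbolic(u, 2, n)
for x in range(8):
    assert newton_symbolic(u, x) == u(x)
```

The agreement holds even though $2^x+x^3$ is not rational integral, because for integral $x$ the binomial expansion of $(1+\Delta)^x$ terminates of itself.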

Another illustration of the algebraic treatment of symbols, 
which is useful, is as follows: 

Taylor’s expansion of the ordinary calculus may be written

$$f(x+h) = f(x)+hD_xf(x)+\frac{h^2}{2!}D_x^2f(x)+\frac{h^3}{3!}D_x^3f(x)+\text{etc.},$$

where $D_x$ is employed as the symbol for the derivative.
If we let $h=1$ and use the notation peculiar to finite differences,

$$u_{x+1} = Eu_x = u_x+D_xu_x+\frac{D_x^2u_x}{2!}+\frac{D_x^3u_x}{3!}+\text{etc.}$$

$$= \left(1+D_x+\frac{D_x^2}{2!}+\frac{D_x^3}{3!}+\text{etc.}\right)u_x$$

$$= e^{D_x}u_x \quad\text{(see the expansion of $e^x$ given below)}$$

or

$$E = e^{D_x}. \qquad (12)$$

Thus, $E$, $\Delta$, and $e^{D_x}$ are seen to be connected symbolically
by the relations

$$E = 1+\Delta = e^{D_x}. \qquad (13)$$


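20. Bernoulli’s Numbers. The coefficients of the terms of
the expansion

$$\frac{x}{e^x-1} = B_0-B_1x+B_2\frac{x^2}{2!}-B_4\frac{x^4}{4!}+B_6\frac{x^6}{6!}-\text{etc.}, \qquad (14)$$

or $B_0$, $B_1$, $B_2$, etc., are known as Bernoulli’s Numbers. The
expansion is easily obtained by ordinary division, where $e^x$ is
replaced by the series

$$e^x = 1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\frac{x^4}{4!}+\text{etc.}$$

The values of a few of the numbers are as follows:

$$B_0 = 1, \qquad B_1 = \tfrac12, \qquad B_2 = \tfrac16, \qquad B_4 = \tfrac{1}{30}, \qquad B_6 = \tfrac{1}{42}, \qquad \text{etc.}$$

All the numbers corresponding to odd subscripts are zero,
except $B_1$; this fact can be verified by inspection of the following
identity,

$$\frac{x}{e^x-1}+\frac{x}{2} = \frac{x}{2}\cdot\frac{e^x+1}{e^x-1},$$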

for the left side is the expansion (14) minus the second term, 
and if the sign of x is changed on the right side the expression 
remains unaffected, showing that it is an even function or that 
(14) contains no odd powers after the second term. 
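The "ordinary division" by which expansion (14) is obtained can be carried out with exact fractions. The sketch below is illustrative; it divides $x$ by the series for $e^x-1$, that is, it finds the reciprocal of $1+x/2!+x^2/3!+$ etc., and the function name is hypothetical. The signs attached to the numbers follow the convention of (14) as written above, which is an assumption about the garbled original.

```python
from fractions import Fraction
from math import factorial

def reciprocal_series_coefficients(terms):
    """Coefficients c_0, c_1, ... of x/(e^x - 1) = 1/(1 + x/2! + x^2/3! + ...)."""
    a = [Fraction(1, factorial(k + 1)) for k in range(terms)]
    c = [Fraction(1)]
    for m in range(1, terms):
        # The coefficient of x^m in the product of the two series must vanish.
        c.append(-sum(a[k] * c[m - k] for k in range(1, m + 1)))
    return c

coeffs = reciprocal_series_coefficients(8)
B1 = -coeffs[1]                   # second term of (14) is -B_1 x
B2 = coeffs[2] * factorial(2)     # third term is +B_2 x^2/2!
B4 = -coeffs[4] * factorial(4)    # next term is -B_4 x^4/4!
B6 = coeffs[6] * factorial(6)     # next term is +B_6 x^6/6!
print(B1, B2, B4, B6)
```

The coefficients of the odd powers beyond the first come out as zero, in agreement with the argument from the even function.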

21. Summation of Series. The expansion (14) given in the
preceding section proves to be well adapted, with certain
interpretations, to summation of series. It has been employed
to sum series of certain troublesome forms to give approximate
results, but we shall have to omit the consideration of these
special forms here.

Since $\Delta = e^{D_x}-1$,

$$\Sigma u_x = \Delta^{-1}u_x = \left(e^{D_x}-1\right)^{-1}u_x = B_0D_x^{-1}u_x-B_1u_x+B_2\frac{D_xu_x}{2!}-B_4\frac{D_x^3u_x}{4!}+\cdots,$$

where, it will be noted, the various powers of $D_x$ are interpreted
as orders of differentiation. If, in addition, we interpret
$D_x^{-1}u_x$ as the inverse of differentiation, or as integration,
we obtain finally

$$\Sigma u_x = C+\int u_x\,dx-\frac{u_x}{2}+\frac{D_xu_x}{12}-\frac{D_x^3u_x}{720}+\frac{D_x^5u_x}{30240}-\text{etc.} \qquad (15)$$

As an application of formula (15) let us find the sum of the
first $n$ terms of the series $0+1^4+2^4+3^4+$ etc.
It will be helpful to verify the following results:

$$\sum_{x=0}^{n-1}x^4 = \Sigma_0^n\,x^4 = \left(C+\int x^4\,dx-\frac{x^4}{2}+\frac{D_xx^4}{12}-\frac{D_x^3x^4}{720}\right)\Big|_0^n$$

$$= \left(\frac{x^5}{5}-\frac{x^4}{2}+\frac{x^3}{3}-\frac{x}{30}\right)\Big|_0^n = \frac{n^5}{5}-\frac{n^4}{2}+\frac{n^3}{3}-\frac{n}{30}.$$

Verify the result for the sum of 1, 2 and 3 terms respectively.
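The verification asked for above may be carried further by machine. This sketch is illustrative only; it assumes that the series in question is $0+1^4+2^4+3^4+$ etc. and that formula (15) yields the closed form $n^5/5-n^4/2+n^3/3-n/30$. The function name is hypothetical.

```python
from fractions import Fraction

def sum_fourth_powers(n):
    """n^5/5 - n^4/2 + n^3/3 - n/30, the closed form from formula (15) for u_x = x^4."""
    n = Fraction(n)
    return n**5 / 5 - n**4 / 2 + n**3 / 3 - n / 30

# Compare with the actual sum of 1, 2, 3, ... terms of 0 + 1^4 + 2^4 + ...
for n in range(1, 10):
    assert sum_fourth_powers(n) == sum(x**4 for x in range(n))
```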


EXERCISES 
Find the sum of the first » terms of the series whose general terms 
are: 
1. x+3. (Check your results.) 
2. £3-+-2, 
3, 2-1, 


CHAPTER III 
INTERPOLATION 


22. Interpolation——Even though the functional relation 
between two or more variables is given, it may prove very 
troublesome to evaluate the function for given values of the 
independent variables. Take, for example, the equation
$y=\log_{10}x$. Although it is not particularly difficult to compute
the logarithm of a given number, say by the use of an infinite
series, such a method requires too much labor and time to be
employed in an ordinary application of logarithms. The
method ordinarily employed is to have a table of the most
frequently desired values of such a function and then to obtain
other values by proportion. The observing student will note
that such a procedure amounts to an assumption that the
graph of the function is a straight line. Although such an
assumption is clearly unjustified with respect to the graph of
any function taken as a whole, it proves quite reasonable in
many cases when applied to the values of the function within
a relatively small interval, and values so obtained often prove
sufficiently accurate to meet the needs at the time; the process
then avoids much labor in computation. The scheme just
considered is familiar to the student and is already known to
him as interpolation. We shall use the term interpolation in
the same connection; but we shall extend the application of
the scheme to cases where the values of a function within a
given interval are assumed to satisfy a more general function,
namely, a rational integral function or polynomial of the form
$y=a+bx+cx^2+$ etc., sufficiently accurately to meet the immediate
needs. As we have already found by differencing, there
are many functions which will not justify such an assumption at
all; on the other hand, there is a surprisingly large number of
functions which lend themselves much more satisfactorily to 
the plan we propose to consider than the nature of the function 
itself would probably indicate. The final test as to whether 
the plan is feasible in any particular case will lie in the relative 
sizes of the successive differences of the given values. This 
fact extends the usefulness of the scheme, because in a great 
many cases only the numerical values will be given and nothing 
of the nature of the fundamental function will be known. 

We shall restrict our attention to the interpolation of values 
of functions of a single variable. 

23. Newton’s Formula. Newton’s formula proves admirably
suited for interpolation from the point of view of simplicity
and ease in computation. Primarily, Newton’s formula
represents a curve which passes through the points $(0, u_0)$,
$(1, u_1)$, etc., whose equation is rational integral and in general
of the $(n-1)$th degree if there are $n$ points. As a concrete
illustration, the output of steel in the United States in hundreds
of thousands of tons for the four specified years was:

    Year    Output         First diff.     Second diff.      Third diff.
    1890    u_0 =  43
                           Δu_0 = 18
    1895    u_1 =  61                      Δ²u_0 = 23
                                  41                         Δ³u_0 = 34
    1900    u_2 = 102                              57
                                  98
    1905    u_3 = 200

Hence, the estimated output for, say, year 1897 would be
$u_{7/5}$, obtained from Newton’s formula in the form

$$u_x = 43+18x+23\,\frac{x(x-1)}{2!}+34\,\frac{x(x-1)(x-2)}{3!},$$

where the interpolation is restricted to third differences, or

$$u_{7/5} = 43+18\cdot\tfrac{7}{5}+23\cdot\tfrac{7}{25}-34\cdot\tfrac{7}{125} = 72.736 \text{ or } 73.$$
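The interpolation for 1897 can be reproduced as follows. The sketch is illustrative and not part of the original text; it takes the 1890 output as 43, the figure used in the formula above, and the helper names `leading_differences` and `newton` are hypothetical.

```python
from fractions import Fraction

def leading_differences(values):
    """Leading term and leading differences u_0, Δu_0, Δ²u_0, ... of a list of values."""
    lead, row = [], list(values)
    while row:
        lead.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return lead

def newton(lead, x):
    """Evaluate Newton's formula for the given leading differences at abscissa x."""
    total, coeff = Fraction(0), Fraction(1)
    for k, d in enumerate(lead):
        total += coeff * d
        coeff = coeff * (x - k) / (k + 1)
    return total

output = [43, 61, 102, 200]                 # 1890, 1895, 1900, 1905
lead = leading_differences(output)          # [43, 18, 23, 34]
estimate = newton(lead, Fraction(7, 5))     # 1897 lies 7/5 of an interval after 1890
print(lead, float(estimate))
```

Passing a shorter list of leading differences to `newton` restricts the work to lower differences.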
What would have been the result of restricting the work to
second differences? To first differences? Restricting the use
of Newton’s formula to first, second, third, etc., differences
amounts obviously to passing curves of the first (straight line),
second (parabola), third, etc., degree respectively through
two, three, four, etc., points respectively.

It should be emphasized that $u_0$ can be assigned at will to
whatever given value we please, and $u_1$, $u_2$, etc., in accordance
with this assignment; once, however, the assignment is made,
the abscissa of the ordinate or value to be interpolated is
uniquely but easily determined by inspection; moreover, the
use of Newton’s formula in the form given above (that is, in
terms of $u_0$ and its differences) requires that the first value
given shall be $u_0$ or the value corresponding to $x=0$.

The most prominent statisticians have come to agree that 
third or fourth differences are usually sufficient when dealing 
with ordinary statistical data; this fact is kept in mind in 
the treatment and the illustrations given in the following 
pages. 

Attention is called to the remarkable fact that, although 
the logarithmic function is irrational by nature and therefore 
has no order of differences that vanishes absolutely, differences 
of numerical values of the function taken from a reasonably 
small interval converge so rapidly in practice that it usually 
requires few differences to interpolate with a remarkable 
degree of accuracy. As an example, suppose that we have
given the following values:

    log 2.7182 = 0.4342814081
    log 2.7183 = 0.4342973851
    log 2.7184 = 0.4343133615
    log 2.7185 = 0.4343293373
    log 2.7186 = 0.4343453126

If we difference these values and employ Newton’s formula,
we obtain

$$\log e = 0.4342814081+0.8182846\,(159770\cdot10^{-10})+\frac{0.82(0.18)}{2}\,(6\cdot10^{-10})$$

$$= 0.4342944819 \quad\text{(correct to the last place)},$$

where $e = 2.71828182846$. The values of $x$ and $1-x$ are
reduced to two decimals in multiplying the second difference
because of the relative smallness of the latter.

24. Lagrange’s Formula: Central Difference Formulas.— 
If the series of given values are not “ equidistant ’’ a convenient 
formula to be used is that known as Lagrange’s formula 
(although the formula is applicable also when the values are 
equidistant). This formula, however, is not based upon 
finite differences and proves rather cumbersome in application; 
its selection, therefore, is usually a matter of necessity rather 
than of preference. If the $n$ values $f(a), f(b), f(c), \ldots, f(k)$ are
given, we assume a rational integral function of the $(n-1)$th
degree of the following form:

$$f(x) = A(x-b)(x-c)\cdots(x-k)+B(x-a)(x-c)\cdots(x-k)+\text{etc.},$$

where a different one of the factors $(x-a)$, $(x-b)$, etc., is missing
in each product.
Substituting $x=a$ we obtain

$$A = \frac{f(a)}{(a-b)(a-c)\cdots(a-k)};$$

likewise, for $x=b$,

$$B = \frac{f(b)}{(b-a)(b-c)\cdots(b-k)},$$

and so on for other coefficients. Substituting these expressions
in the equation assumed originally, we obtain

$$f(x) = f(a)\frac{(x-b)(x-c)\cdots(x-k)}{(a-b)(a-c)\cdots(a-k)}+f(b)\frac{(x-a)(x-c)\cdots(x-k)}{(b-a)(b-c)\cdots(b-k)}+\text{etc.}, \qquad (16)$$

which is known as Lagrange’s formula.

As an illustration, suppose that we select the values of the
output of steel for the years 1890, 1900 and 1905 given in the
preceding section; then we may let $a=0$, $b=2$, and $c=3$, and
$f(0)=43$, $f(2)=102$ and $f(3)=200$; formula (16) then becomes

$$f(x) = 43\,\frac{(x-2)(x-3)}{(-2)(-3)}+102\,\frac{x(x-3)}{(2)(2-3)}+200\,\frac{x(x-2)}{(3)(3-2)}.$$

It is easily verified that if we substitute 7/5 for $x$ in this
equation we obtain 65 as the estimated output for 1897. It is
left for the student to show that if formula (16) were applied
to all the data of the original problem the result would be the
same as that obtained previously. Why must the result be
the same?

Sometimes the data are known to be very accurate and
accuracy in the final results of interpolation is very essential.
Under such circumstances it is well to consider a certain fault
in Newton’s formula and employ a suitable modification of it.
Newton’s formula is expressed, of course, in terms of $u_0$ and
its differences; since $u_0$ is an “end” value it receives more
emphasis than it deserves. Slightly better results would be
obtained if the interpolations were made in a central interval
in terms of values located on both sides of the interpolated
values, without giving too much emphasis to the values on any
one side. Formulas have been derived with this idea in mind
and are called central difference formulas; all can be derived by
simple modifications of Newton’s formula, but as they would
be valuable only in connection with data which would be more
accurate than ordinary statistical data we shall omit further
consideration of them here. The student who is contemplating
work of a more refined character should, however, keep
them in mind.
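The illustration of Lagrange’s formula given earlier in this section may be checked with exact fractions. The sketch below is illustrative only; it takes the 1890 output as 43, in agreement with the difference table of the preceding section, and the function name `lagrange` is hypothetical.

```python
from fractions import Fraction

def lagrange(points, x):
    """Evaluate formula (16) at x for a list of (abscissa, value) pairs."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= Fraction(x - xj) / Fraction(xi - xj)
        total += term
    return total

x = Fraction(7, 5)                                            # the year 1897
three_points = lagrange([(0, 43), (2, 102), (3, 200)], x)
four_points = lagrange([(0, 43), (1, 61), (2, 102), (3, 200)], x)
print(float(three_points), float(four_points))                # 65.12 72.736
```

With all four values the formula returns the figure already found by Newton’s formula, since only one curve of the third degree passes through four given points.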


EXERCISES 


1. Find the logarithms of the following numbers (using the tables 
given in the back of this book): 


(a) 93.4632 (The final results are likely to prove a 

(b) 0.632481 unit or so in error in the last place. At 

(c) 0.00128734 what part of the table are the results 
more likely to be in error?) 


2. Find the antilogarithms of the following logarithms (using the 
tables given in this book): 


(a) 2.834672. 
(b) 4.164872. 
(c) 8.426838—10. 


3. Inverse interpolation consists of solving backward for x. Use 
Newton’s formula and find the antilogarithm of the following numbers, 
using the values given in the table of logarithms: 


(a) 2.368427, using first differences only; 
(b) 2.368427, using second differences and solving a quad- 
ratic equation. 
4. Find the logarithm of 132 from the logarithms of 131, 133 and 


135 (using the tables given in this book): 


(a) by Lagrange’s formula; 
(b) by Newton’s formula. 


5. Test the following sets of values as to whether satisfactory 
interpolation by Newton’s formula could be expected: 


(a) The amounts to which $1 would accumulate at compound
interest at 2 per cent in 16, 17, etc., years:

    Years    Amount
    16       1.37278571
    17       1.40024142
    18       1.42824625
    19       1.45681117
    20       1.48594740

(b) The reciprocals of the numbers: 


7651 = .0001307019 ... 
7652 = .0001306849 ... 
7653 = .0001306677 ... 
7654 = .0001306506 . 
7655 = .0001306336 . . 


6. The pressure of wind in pounds per square foot corresponding 
to the velocity in miles per hour has been determined by experiment 
to be approximately as follows: 


    Velocity    Pressure
    15          1.1
    20          2.0
    30          4.4
    40          7.9


Estimate the pressure for a velocity of 25 miles per hour. 


7. Given log 71=1.8512583 
log 72=1.8573325 
log 73 =1.8633229 
log 74=1.8692317 


find log 71.54. Find also log 0.07154. 


8. Given log sin 12° 39′ = 9.34043382 − 10
         log sin 12° 40′ = 9.34099630 − 10
         log sin 12° 41′ = 9.34155802 − 10
         log sin 12° 42′ = 9.34211897 − 10
         log sin 12° 43′ = 9.34267917 − 10

find log sin 12° 40′.4134. (The computation may be simplified by
rounding the values of $x$, $x-1$, etc., consistent with the relative values
of the differences.) Ans. 9.34122861 − 10.

9. Given log cos 27° 36′ = 9.94753350 − 10
         log cos 27° 37′ = 9.94746743 − 10
         log cos 27° 38′ = 9.94740132 − 10
         log cos 27° 39′ = 9.94733516 − 10
         log cos 27° 40′ = 9.94726895 − 10

find log cos 27° 38′.3. Ans. 9.94738147 − 10.

10. Given log 472 = 2.67
          log 473 = 2.67
          log 474 = 2.675
          log 475 = 2.676
          log 476 = 2.677

find log 472882.


11. Given the death rates per 100,000 population in the registra- 
tion area of the United States by years for the following diseases: 


Typhoid 
Sled 


1906 


o 
1909 2 
1912 16.5 
1915 12.4 


Tuberculosis Cancer 
157.1 69.1 
139.3 73.8 
129.8 FACE AG, 
L207 81.1 


estimate the death rates for the year 1910 by Newton’s formula. 


25. Leading-difference Formulas. If any formula based
upon finite differences is to be employed to interpolate several
values in the same interval, the work can be simplified and
systematized somewhat further by formulas for the leading
differences of these interpolated values. As an illustration we
shall show how Newton’s formula can be so applied to interpolate
$t-1$ values in a central interval. If we difference three times the
values of $u_1$, $u_{1+1/t}$, $u_{1+2/t}$, etc., obtained from Newton’s formula (to
third differences), the leading differences for interpolating
$t-1$ values by third differences between $u_1$ and $u_2$ become

$$(1) = \frac{\Delta u_0}{t}+\frac{t+1}{2t^2}\Delta^2u_0-\frac{t^2-1}{6t^3}\Delta^3u_0$$

$$(2) = \frac{\Delta^2u_0}{t^2}+\frac{\Delta^3u_0}{t^3} \qquad (17)$$

$$(3) = \frac{\Delta^3u_0}{t^3}$$

$$(4) = (5) = \text{etc.} = 0$$

For $t=5$ these leading differences become

$$(1) = .2\Delta u_0+.12\Delta^2u_0-.032\Delta^3u_0$$

$$(2) = .04\Delta^2u_0+.008\Delta^3u_0 \qquad (17')$$

$$(3) = .008\Delta^3u_0$$


As an example, suppose we wish to interpolate four equidistant
values between $u_1$ and $u_2$ of the following hypothetical
set of values:

    u_0 = 416
                 138
    u_1 = 554           -34
                 104              8
    u_2 = 658           -26
                  78
    u_3 = 736

Then by (17′)

    (1) = .2(138) - 3(.04)34 - 4(.008)8 = 23.264
    (2) = -1.296
    (3) = 0.064

The differences just found are then added accumulatively
to $u_1 = 554$, as follows, to give the desired interpolations:

    u_1 = 554.000
                    (1) = 23.264
          577.264                   (2) = -1.296
                          21.968                    (3) = .064
          599.232                         -1.232
                          20.736                          .064
          619.968                         -1.168
                          19.568                          .064
          639.536                         -1.104
                          18.464
    u_2 = 658.000

Probably the main advantage to be gained by the use of
such formulas is the check upon the work; for not only are
the $t-1$ (4 in the example) values interpolated, say between $u_1$
and $u_2$, but the “end” value $u_2$ is also reproduced, if no
error is made in the computation. The work was carried to
the last decimal place in the above example to show this check;
it would be unnecessary to retain all the decimal places in
practice.
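The accumulative addition of the leading differences is readily mechanized. The following sketch is illustrative only; it assumes the leading differences $(1)=\Delta u_0/t+(t+1)\Delta^2u_0/2t^2-(t^2-1)\Delta^3u_0/6t^3$, $(2)=\Delta^2u_0/t^2+\Delta^3u_0/t^3$ and $(3)=\Delta^3u_0/t^3$, which reduce to (17′) for $t=5$, and the function name `subdivide` is hypothetical.

```python
from fractions import Fraction

def subdivide(u0, d1, d2, d3, t):
    """Values u_1, u_{1+1/t}, ..., u_2 built up from the leading differences."""
    t = Fraction(t)
    lead1 = d1 / t + (t + 1) / (2 * t**2) * d2 - (t**2 - 1) / (6 * t**3) * d3
    lead2 = d2 / t**2 + d3 / t**3
    lead3 = d3 / t**3
    values = [Fraction(u0 + d1)]          # start at u_1 = u_0 + Δu_0
    for _ in range(int(t)):
        values.append(values[-1] + lead1)  # add the current first difference
        lead1 += lead2                     # then advance the first difference
        lead2 += lead3                     # and the second difference
    return values

# The hypothetical data of the example: u_0 = 416 with differences 138, -34, 8.
values = subdivide(416, 138, -34, 8, 5)
print([float(v) for v in values])
```

The last value reproduces the end value $u_2$ exactly, which is the check described in the text.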

26. Tangential Interpolation.—If the method of interpolating
several values in an interval, explained in the preceding
section, is applied to several succeeding intervals, the interpolation
curve passing through the values interpolated between,
say, $y_1$ and $y_2$ will not in general be continuous with the curve
passing through the values interpolated between $y_2$ and $y_3$, etc.
Hence, in the final series of interpolated values there will in
general be discontinuities at $y_2$, $y_3$, $y_4$, etc. It is possible to
adjust whatever interpolation formula is employed so that any
two interpolation curves will have the same slope at the point
of their intersection, that is, at the point which constitutes the
“end” point of one interval and the “beginning” point of
the next interval. Interpolation based upon such a scheme is
called tangential¹ interpolation.


¹ If two interpolated curves are required to have not only the same
slopes but also the same curvatures at their points of intersection, the
interpolation is called osculatory, because two such adjacent curves are
said to have a common osculating circle at such a point. Osculatory
interpolation, however, will be found to require fifth differences, and as
fifth differences are ordinarily inappropriate for our purposes we shall
give our sole attention to tangential interpolation, which calls for only
third differences.

The scheme will now be considered in connection with
Newton’s formula.
We shall assume the tangential formula to be of the form

$$y_x = a + bx + cx^2 + dx^3 \qquad (A)$$

and our immediate problem is to determine the coefficients
$a$, $b$, $c$ and $d$.
If the corresponding curve passes through $y_1$ and $y_2$ we
must have

$$y_1 = a + b + c + d = y_0 + \Delta y_0 \qquad (B)$$

and

$$y_2 = a + 2b + 4c + 8d = y_0 + 2\Delta y_0 + \Delta^2 y_0. \qquad (C)$$

Next we require that the slope of curve (A) at $y_1$ shall be
the same as that of the parabola through $y_0$, $y_1$ and $y_2$, or

$$y_x = y_0 + x\,\Delta y_0 + \frac{x(x-1)}{2}\,\Delta^2 y_0,$$

at the same point. It is easily verified that this requirement
leads to the relation

$$b + 2c + 3d = \Delta y_0 + \tfrac{1}{2}\Delta^2 y_0. \qquad (D)$$

Finally, we require that the slope of curve (A) at $y_2$ shall be
the same as the slope of the parabola through $y_1$, $y_2$ and $y_3$,
whose equation is the same as that of the parabola through
$y_0$, $y_1$ and $y_2$, except that $y_0$ and its differences should be replaced
by $y_1$ and its differences. The value of the slope at $y_2$ is then
$\Delta y_1 + \tfrac{1}{2}\Delta^2 y_1$. If we express the value of this slope in terms of
$y_0$ and its differences, this requirement leads to the relation

$$b + 4c + 12d = \Delta y_0 + \tfrac{3}{2}\Delta^2 y_0 + \tfrac{1}{2}\Delta^3 y_0. \qquad (E)$$

Solving relations (B), (C), (D) and (E) simultaneously for
$a$, $b$, $c$ and $d$ and substituting these values in equation (A), we
obtain the tangential formula

$$y_x = y_0 + x\,\Delta y_0 + \frac{x(x-1)}{2}\,\Delta^2 y_0 + \frac{(x-1)^2(x-2)}{2}\,\Delta^3 y_0. \qquad (18)$$

If we refer to the values $y_0$, $y_1$ and $y_2$ for the present as
“the first set” and $y_1$, $y_2$ and $y_3$ as “the second set” for the
interval from $y_1$ to $y_2$, it should be evident that in interpolating
in that interval by means of the differences of $y_0$, $y_1$, $y_2$ and $y_3$
by formula (18) the slope of the curve at $y_1$ is determined by
the first set of values and the slope at $y_2$ by the second set.
Likewise, in interpolating in the succeeding interval from $y_2$
to $y_3$, the slope at $y_2$ is determined by the first set for that
interval which, however, is identical with the second set of the
preceding interval (that is, from $y_1$ to $y_2$); hence, the interpolation
curves of the two intervals must have the same slope at
their point of intersection $y_2$; and so on for all succeeding
intervals.
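
The argument about common slopes can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the functions `tangential` and `slope` are our own names, the cubic is formula (18) as obtained by solving (B) to (E), the slope is estimated by a central difference, and the fifth data value (790) is a hypothetical addition made so that two adjacent intervals are available.

```python
def tangential(y, x):
    """Formula (18): value at abscissa x (1 <= x <= 2) from the four
    equidistant values y = [y0, y1, y2, y3]."""
    d1 = y[1] - y[0]
    d2 = y[2] - 2 * y[1] + y[0]
    d3 = y[3] - 3 * y[2] + 3 * y[1] - y[0]
    return (y[0] + x * d1 + x * (x - 1) / 2 * d2
            + (x - 1) ** 2 * (x - 2) / 2 * d3)


def slope(y, x, h=1e-6):
    """Central-difference estimate of the slope of the curve at x."""
    return (tangential(y, x + h) - tangential(y, x - h)) / (2 * h)


data = [416, 554, 658, 736, 790]      # the last value is hypothetical
first, second = data[0:4], data[1:5]

# Four values interpolated between y1 and y2 (t = 5).
inner = [round(tangential(first, 1 + k / 5), 3) for k in range(1, 5)]
print(inner)

# Slope at the end of one interval and at the beginning of the next.
print(round(slope(first, 2), 4), round(slope(second, 1), 4))
```

The two slopes printed agree, although the two cubics are built from different sets of four values.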

The leading differences of formula (18) for interpolating
$t-1$ values in the interval from $y_1$ to $y_2$ are as follows:

$$
\begin{aligned}
(1) &= \frac{\Delta y_0}{t} + \frac{t+1}{t^2}\cdot\frac{\Delta^2 y_0}{2} - \frac{t-1}{t^3}\cdot\frac{\Delta^3 y_0}{2} \\
(2) &= \frac{\Delta^2 y_0}{t^2} - \frac{t-3}{t^3}\,\Delta^3 y_0 \\
(3) &= \frac{3\,\Delta^3 y_0}{t^3} \\
(4) &= (5) = \text{etc.} = 0
\end{aligned}
\qquad (19)
$$

For $t=5$ these leading differences become

$$
\begin{aligned}
(1) &= .2\,\Delta y_0 + .12\,\Delta^2 y_0 - .016\,\Delta^3 y_0 \\
(2) &= .04\,\Delta^2 y_0 - .016\,\Delta^3 y_0 \\
(3) &= .024\,\Delta^3 y_0.
\end{aligned}
\qquad (19')
$$

27. Interpolation of Ordinates Among Areas.—So far,
interpolations have been concerned entirely with ordinates;
that is, both the data and the values to be interpolated have
been essentially ordinates. We shall now derive a useful
formula to be used for interpolating ordinates where, however,
the data consist essentially of areas. Suppose, for example,
that it is desired to estimate the value of crude materials, for
use in manufacturing silk, imported into the United States for
the single year 1907, from the values for the following groups
of years:


Years        Millions of Dollars
1900-4             1479
1905-9             2096
1910-14            2902


It is evident that no formula derived so far will enable us 
to determine the desired value. We shall return to this par- 
ticular problem later. 

Suppose that $u_{x/t}$ and $y_{x/t}$ are two functions which have the
relation

$$u_{x/t} = y_{(x+1)/t} - y_{x/t}. \qquad (A)$$

Hence

$$y_{(x+t)/t} = \sum_{k=x}^{x+t-1} u_{k/t} + y_{x/t}.$$

Also, let

$$w_{x/t} = u_{x/t} + u_{(x+1)/t} + \dots + u_{(x+t-1)/t}.$$

Hence,

$$w_{x/t} = y_{(x+t)/t} - y_{x/t}. \qquad (B)$$

Substituting in (B)

x = 0      y_1 - y_0 = Δy_0 = w_0
                                      Δ²y_0 = Δw_0
x = t      y_2 - y_1 = Δy_1 = w_1                     Δ³y_0 = Δ²w_0
                                      Δ²y_1 = Δw_1
x = 2t     y_3 - y_2 = Δy_2 = w_2
etc.,

If now we expand $y_{(x+1)/t}$ and $y_{x/t}$ by Newton’s formula and
take the difference between the results, in accordance with
relation (A), replacing $\Delta y_0$ by $w_0$, $\Delta^2 y_0$ by $\Delta w_0$, etc., we
obtain

$$u_{x/t} = \frac{w_0}{t} + \frac{2x-t+1}{2t^2}\,\Delta w_0 + \frac{3x^2-3(2t-1)x+(2t-1)(t-1)}{6t^3}\,\Delta^2 w_0 + \dots \qquad (20)$$

For $t=5$ formula (20) becomes

$$u_{x/5} = \frac{w_0}{5} + \frac{x-2}{25}\,\Delta w_0 + \frac{x^2-9x+12}{250}\,\Delta^2 w_0 + \dots \qquad (20')$$


Now, the values $w_0$, $w_1$, $w_2$, etc., are obviously groups of $t$
values of which the values given in the problem suggested
above are examples, and $u_{x/t}$ represents one of the values
included in such a group whose identification depends upon
the value of $x$; in the problem given above (where $t=5$) $u_{0/5}$,
$u_{1/5}$, etc., represent the values corresponding to the individual
years 1900, 1901, etc., and $u_{0/5}+u_{1/5}+\dots+u_{4/5}=w_0=1479$, etc.
Hence,

w_0 = 1479
                Δw_0 = 617
w_1 = 2096                      Δ²w_0 = 189
                       806
w_2 = 2902

and the value corresponding to the individual year 1907
(for $x=7$) is

$$u_{7/5} = .2(1479) + .2(617) - .008(189) \quad \text{(to second differences)}$$

$$= 417.688 \text{ or } 418.$$
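
The computation for the year 1907 may be sketched in Python as follows. This is an illustration only and not part of the original text; the function name `ordinate_from_groups` is our own, and the coefficients are those of formula (20) carried to second differences, as obtained by differencing Newton's formula.

```python
def ordinate_from_groups(w, x, t):
    """Formula (20), to second differences: the single value u_{x/t}
    from the group totals w = [w0, w1, w2], each group holding t values."""
    dw = w[1] - w[0]
    d2w = w[2] - 2 * w[1] + w[0]
    c1 = (2 * x - t + 1) / (2 * t**2)
    c2 = (3 * x**2 - 3 * (2 * t - 1) * x + (2 * t - 1) * (t - 1)) / (6 * t**3)
    return w[0] / t + c1 * dw + c2 * d2w


silk = [1479, 2096, 2902]            # groups 1900-4, 1905-9, 1910-14
print(round(ordinate_from_groups(silk, 7, 5), 3))     # the year 1907 (x = 7)
```

Because the formula comes from a single parabola fitted to the accumulated totals, the five values for $x=5,\dots,9$ add up again to $w_1$.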


The leading term (why does this case require a leading
term?) and differences for breaking up completely the group $w_1$
(given $w_0$, $w_1$ and $w_2$) are as follows:

$$
\begin{aligned}
\text{Leading term} &= \frac{w_0}{t} + \frac{t+1}{t^2}\cdot\frac{\Delta w_0}{2} + \frac{1-t^2}{t^3}\cdot\frac{\Delta^2 w_0}{6} \\
(1) &= \frac{\Delta w_0}{t^2} + \frac{\Delta^2 w_0}{t^3} \\
(2) &= \frac{\Delta^2 w_0}{t^3}
\end{aligned}
$$

It is easily verified that the leading term and differences
for the values given above are 363.792, 26.192 and 1.512 respectively.
If these values are written and added accumulatively
as follows, we obtain the values corresponding to the individual
years as desired:


Years     Values (in millions of dollars)
1905       363.792
                      26.192
1906       389.984               1.512
                      27.704
1907       417.688               1.512
                      29.216
1908       446.904               1.512
                      30.728
1909       477.632
          --------
          2096.000


The work was carried to a greater number of decimal places
than would be necessary in practice, in order to show a perfect
check; that is, to show that the sum of the values for the individual
years equals $w_1 = 2096$.
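
The breaking up of a whole group by a leading term and leading differences may be sketched in Python as follows. The sketch is illustrative only; the function name `break_up_group` is our own, and the leading term and differences are those stated above for the group $w_1$.

```python
def break_up_group(w, t):
    """Split the middle group w1 into its t single values by adding the
    leading term and leading differences accumulatively."""
    dw = w[1] - w[0]
    d2w = w[2] - 2 * w[1] + w[0]
    term = w[0] / t + (t + 1) / (2 * t**2) * dw + (1 - t**2) / (6 * t**3) * d2w
    first = dw / t**2 + d2w / t**3
    second = d2w / t**3
    values = []
    for _ in range(t):
        values.append(term)
        term += first
        first += second
    return values


years = break_up_group([1479, 2096, 2902], 5)      # 1905, 1906, ..., 1909
print([round(v, 3) for v in years], round(sum(years), 3))
```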


EXERCISES 


1. Given the consumption of coffee in the United States for the 
quinquennial periods: 


1895- 9 50.03 pounds per capita 
1900- 4 55.90 
1905- 9 54.20 
1910-14 46.75, 


estimate the consumption per capita for the year 1902.

2. From the data of problem (1), estimate the consumption per
capita for the individual years 1900, 1901, . . . 1904.

3. From the number of patents (in thousands) issued in the
United States for the years:


1895- 9 117
1900- 4 144
1905- 9 170
1910-14 185,


estimate the number of patents for the years 1900, 1901, . . . 1904.

4. The enrollment (men and women) in the colleges, universities,
and schools of technology of the United States was as follows:


1895- 9 437 thousands
1900- 4 530
1905- 9 691
1910-14 963

Estimate the enrollment for the individual years 1900, 1901, . . . 1904.


5. The numbers of American and Filipino teachers in the Philippine
Islands were as follows:


Years Beginning    American    Filipino
1904- 6              2432       14,896
1907- 9              2357       23,028
1910-12              2005       23,112
1913-15              1638       28,371


Estimate the number of (a) American and of (b) Filipino teachers
for the individual years 1907, 1908 and 1909. Estimate the number
of (c) American and of (d) Filipino teachers for the year 1910.


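28. Areas from Representative Ordinates.—Sometimes the
problem is reversed in the sense that every $t$-th individual
value is known and it is desired to determine the values corresponding
to each group. As in the previous case, second
differences will probably prove sufficient for ordinary purposes.

If we substitute $\frac{t}{t}, \frac{t+1}{t}, \dots, \frac{2t}{t}$ in Newton’s formula we
obtain respectively

$$
\begin{aligned}
u_{t/t} &= u_0 + \Delta u_0 \\
u_{(t+1)/t} &= u_0 + \frac{t+1}{t}\,\Delta u_0 + \frac{(t+1)}{t^2}\cdot\frac{\Delta^2 u_0}{2} \\
u_{(t+2)/t} &= u_0 + \frac{t+2}{t}\,\Delta u_0 + \frac{2(t+2)}{t^2}\cdot\frac{\Delta^2 u_0}{2} \\
u_{(t+3)/t} &= u_0 + \frac{t+3}{t}\,\Delta u_0 + \frac{3(t+3)}{t^2}\cdot\frac{\Delta^2 u_0}{2} \\
&\;\;\vdots \\
u_{(t+t-1)/t} &= u_0 + \frac{t+t-1}{t}\,\Delta u_0 + \frac{(t-1)(t+t-1)}{t^2}\cdot\frac{\Delta^2 u_0}{2} \\
u_{2t/t} &= u_0 + 2\,\Delta u_0 + \Delta^2 u_0
\end{aligned}
$$

The sum of the first $t$ lines gives $w_{t/t}$ and the sum of the
last $t$ lines gives $w_{(t+1)/t}$, or

$$w_{t/t} = t\,u_0 + \frac{3t-1}{2}\,\Delta u_0 + \frac{5t^2-6t+1}{12t}\,\Delta^2 u_0 + \text{etc.} \qquad (22)$$

$$w_{(t+1)/t} = t\,u_0 + \frac{3t+1}{2}\,\Delta u_0 + \frac{5t^2+6t+1}{12t}\,\Delta^2 u_0 + \text{etc.} \qquad (23)$$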

Formula (22) is called an initial form and formula (23) a
terminal form, for obvious reasons; their use is made clear by
the following illustrations: The values of the products of silk
manufactures (in thousands of dollars) of the United States
are given by a certain authority only for individual years;
particular values are as follows:


Year    Values (in thousands of dollars)
1899       107,256
1904       133,288
1909       196,912
1914       254,011


Suppose that it is desired to estimate the total products for
the quinquennial period 1905-9. It is evident that the terminal
form of the formulas derived above is needed. Differencing
the given values and substituting in formula (23), we obtain

$$w_{6/5} = 5(107{,}256) + 8(26{,}032) + 2.6(37{,}592) = 842{,}275.$$


If the values were given for the years 1900, 1905, etc., the 
initial form would obviously be required. 
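
The initial and terminal forms can be checked against a direct summation of Newton's formula. The following Python sketch is an illustration only; the function names are our own, and the coefficients are those of formulas (22) and (23) carried to second differences.

```python
def newton(u, x):
    """Newton's formula to second differences; u = [u0, u1, u2]."""
    du = u[1] - u[0]
    d2u = u[2] - 2 * u[1] + u[0]
    return u[0] + x * du + x * (x - 1) / 2 * d2u


def initial_form(u, t):
    """Formula (22): the sum of the t values u_{t/t} ... u_{(2t-1)/t}."""
    du = u[1] - u[0]
    d2u = u[2] - 2 * u[1] + u[0]
    return t * u[0] + (3 * t - 1) / 2 * du + (5 * t**2 - 6 * t + 1) / (12 * t) * d2u


def terminal_form(u, t):
    """Formula (23): the sum of the t values u_{(t+1)/t} ... u_{2t/t}."""
    du = u[1] - u[0]
    d2u = u[2] - 2 * u[1] + u[0]
    return t * u[0] + (3 * t + 1) / 2 * du + (5 * t**2 + 6 * t + 1) / (12 * t) * d2u


silk = [107256, 133288, 196912]                   # 1899, 1904, 1909
total = terminal_form(silk, 5)                    # the period 1905-9
direct = sum(newton(silk, k / 5) for k in range(6, 11))
print(round(total, 1), round(direct, 1))
```

The two figures printed agree, since formula (23) is nothing more than the sum of the last $t$ lines written out once for all.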

In order to interpolate the values for “end” groups (for
example, for 1910-14 or 1900-4) it is necessary to use one of
the formulas

$$w_{0/t} = t\,u_0 + \frac{t-1}{2}\,\Delta u_0 - \frac{t^2-1}{12t}\,\Delta^2 u_0, \qquad (24)$$

$$w_{1/t} = t\,u_0 + \frac{t+1}{2}\,\Delta u_0 - \frac{t^2-1}{12t}\,\Delta^2 u_0, \qquad (25)$$

where (24) is obtained by adding the first $t$ values and (25) by
adding the last $t$ values obtained by substituting $0/t, 1/t, \dots, t/t$
in Newton’s formula. It should be noticed, however, that for
the data of this problem the total value of the products for the
period 1910-14 would be obtained by reversing the order of
the given values (before differencing) and applying the initial
form (24).
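
The device of reversing the order of the data for an end group may be sketched in Python as follows. This is an illustration only; the function names are our own, and the coefficients are those of formulas (24) and (25) to second differences.

```python
def newton(u, x):
    """Newton's formula to second differences; u = [u0, u1, u2]."""
    du = u[1] - u[0]
    d2u = u[2] - 2 * u[1] + u[0]
    return u[0] + x * du + x * (x - 1) / 2 * d2u


def first_group(u, t):
    """Formula (24): the sum of the t values u_{0/t} ... u_{(t-1)/t}."""
    du = u[1] - u[0]
    d2u = u[2] - 2 * u[1] + u[0]
    return t * u[0] + (t - 1) / 2 * du - (t**2 - 1) / (12 * t) * d2u


def last_group(u, t):
    """Formula (25): the sum of the t values u_{1/t} ... u_{t/t}."""
    du = u[1] - u[0]
    d2u = u[2] - 2 * u[1] + u[0]
    return t * u[0] + (t + 1) / 2 * du - (t**2 - 1) / (12 * t) * d2u


# Values for 1914, 1909, 1904 (order reversed), so that x = 0, 1, ..., 4
# correspond to the years 1914, 1913, ..., 1910.
silk_reversed = [254011, 196912, 133288]
print(round(first_group(silk_reversed, 5), 1))
```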


EXERCISES 


1. The values of silk products (in thousands of dollars) in the
United States, previous to 1904, were known only for the years:


1869 12,211 
1879 41,033 
1889 87,298 
1899 107,256. 


Estimate the value of the products for the period 1880-9.

2. From the enrollment in colleges, universities, and schools of
technology of the United States for the following years:


1900 98,923 
1905 120,099 
1910 163,019 
1915 237,011 


Estimate the enrollment for the period 1905-9; the period 1910-14. 
3. Same as problem (2) but using the following data: 


1899 92,385 
1904 111,688 
1909 161,808 
1914 216,636 


4. Estimate the total consumption of sugar per capita in the 
United States for the period 1905-9, from the figures: 


1900 58.81 pounds 
1905 71.55 
1910 79.77 
1915 86.84 


5. Estimate the consumption of sugar per capita (from data of 
problem (4)) for the period 1900-4; for the period 1901-5. 

6. Estimate the consumption of sugar per capita for the period 
1911-15. 


EXERCISES² IN CONSTRUCTING ABRIDGED
MORTALITY TABLES


² These Exercises are perhaps a little too specialized to be included
in a short course.


1. The population statistics (decennial) of the whole country, and
the mortality statistics (annual) of the so-called registration area, are
published only by age groups, because of the concentration at ages
which are multiples of 5 when the statistics are collected by single
ages. The population statistics ($L_x$) for females of ten registration
states for 1920, for decennial age groups, are in part as follows:


Age x        L_x
5-14      2,529,809
15-24     2,350,086
25-34     2,348,953
etc.


Interpolate the populations for the single ages 10 and 20. 
Ans. 250,921 and 233,842. 


2. The mortality statistics ($d_x$) for females of ten registration
states for 1920 are in part as follows:


Age x       d_x
5-14       6,493
15-24     10,609
25-34     15,866
etc.


Interpolate the number of deaths for the single ages 10 and 20. 
Ans. 662 and 1080. 

(3) Compute the leading term and differences of the population 
statistics of Exercise 1, and interpolate the populations for all of the 
single ages 15-24. 

(4) Same as Exercise 3, but for the mortality statistics of Exer- 
cise 2. 

5. As the population ($L_x$) for any single age, as obtained from the
federal statistics, is regarded as referring to the population at the middle
of the calendar year, the population at the beginning of the year is
approximated by adding one-half of the deaths that take place during
the year. The death rate at any age is computed then by the relation

$$q_x = \frac{d_x}{L_x + \tfrac{1}{2}d_x}$$

and the probability of living one year by the relation,

$$p_x = 1 - q_x = \frac{L_x - \tfrac{1}{2}d_x}{L_x + \tfrac{1}{2}d_x}.$$

The values of $p_x$ were computed by the latter relation by logarithms,
from statistics of which those given in the preceding exercises are a
part, as follows:


Age x     log p_x
10        9.9974-10
20        9.9954-10
30        9.9931-10
etc.


Since the probability of living $n$ years

$${}_np_x = p_x\,p_{x+1}\,p_{x+2}\cdots p_{x+n-1},$$

then

$$\log {}_np_x = \log p_x + \log p_{x+1} + \dots + \log p_{x+n-1}.$$

Find the logarithm of the probability of living ten years for ages 

10 and 20. (Hint: Use formulas (24) and (22)). 
Ans. 9.9821—10 and 9.9377—10. 

6. If we assume any specified number of individuals to be living
at an early age, say age 10, we need only multiply by successive values
of ${}_np_x$, say for $n=10$ and for ages 10, 20, 30, etc., to obtain the number
of survivors ($l_x$) at isolated ages, say 20, 30, etc. Such a table of
survivors for all ages, say 10, 11, 12, etc., constitutes what is called a
mortality table; and a table of survivors at isolated ages, such as 10,
20, etc., is called an abridged mortality table. Wide variation in the
death rates for ages in the neighborhood of the age of birth practically
necessitates the omission of ages much before age 10 in the construction
of abridged mortality tables; moreover, if the number assumed
to be living at the earliest age, called the radix, is taken to be no
greater than 1000, much trouble will be avoided in trying to trace the
last survivors at the highest ages, which would, in any case, have no
appreciable effect upon the most important or earlier parts of the
table. The abridged mortality table constructed from the statistics
represented above is as follows:


Age x     l_x       Age x     l_x
10       1000       60       646
20        965       70       438
30        913       80       183
40        847       90        24
50        768


Check the values of $l_{20}$ and $l_{30}$.

7. Mortality tables are almost always accompanied by a column
of values of what is called the expectation of life ($e_x$). The formula
for computing the expectation of life at any age $x$ is

$$e_x = \frac{1}{2} + \frac{l_{x+1} + l_{x+2} + \dots \text{ to end of table}}{l_x}.$$

The expression on the right, omitting the $\tfrac{1}{2}$, would give the expectation
of life (called the curtate expectation of life) if all survivors
died at the very beginning of the year of death; as the deaths are
much more likely to be distributed uniformly throughout the year of
death, it is usually assumed that the average person lives half a year
in the year of death and so $\tfrac{1}{2}$ is added.

If formula (23) were applied to the abridged mortality table given
above (formula (25) should be used to interpolate the first value),
the values of $l_x$ obtained for the age groups 11-20, 21-30, 31-40, etc.,
when added accumulatively, beginning with the higher ages, would
give successive values which, when divided by the corresponding
values of $l_x$ for individual ages, would give values of the curtate
expectation of life. Some method of determining the value of $l_x$ for
the age group 91-100 would have to be devised, but no great concern
should be felt in this determination (graphical methods would be
satisfactory) because its value can have no appreciable effect upon
the values of the expectation of life at the earlier and most important
ages, the only ages at which the statistics are usually sufficiently
reliable.

Check the following work:


Age x    l_x     l, ages (x+1) to (x+10)      Σ       Curtate Expectation    e_x
10      1000            9821               52,395           52.4             52.9
20       965            9377               42,574           44.1             44.6
30       913            8779               33,197           36.4             36.9
40       847            8046               24,418           28.8             29.3
50       768            7044               16,372           21.3             21.8
60       646            5388                9,328           14.4             14.9
70       438            3017                3,940            9.0              9.5
80       183             875                  923            5.0              5.5
90        24              48                   48

8. The population and the mortality statistics for 1920, of the
males of the ten states which were registration states in 1900, are as
follows:


Ages      Population     Deaths      Age     e_x
5-14      2,535,630       7,582      10      62.3
15-24     2,236,577       9,979      20      44.0
25-34     2,396,321      14,681      30      [illegible]
35-44     2,029,859      16,816      40      28.4
45-54     1,562,933      20,348      50      21.0
55-64       963,954      25,602      60      14.3
65-74       488,935      28,525      70       9.0
75-84       169,295      21,717      80       5.2
85-94        23,494       6,046
95-             940         327


Construct an abridged mortality table for decennial ages and 
check the column of values of expectation of life given to the right 
above. 
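
The checking asked for in Exercise 7 can be sketched in Python as follows. The sketch is an illustration only and not part of the original text; it takes the group sums of $l_x$ from the table of Exercise 7 as given, accumulates them from the highest ages downward, and divides by $l_x$.

```python
ages = [10, 20, 30, 40, 50, 60, 70, 80, 90]
l_x = [1000, 965, 913, 847, 768, 646, 438, 183, 24]
group_sums = [9821, 9377, 8779, 8046, 7044, 5388, 3017, 875, 48]

# Accumulate the group sums, beginning with the higher ages.
running, accumulated = 0, []
for s in reversed(group_sums):
    running += s
    accumulated.append(running)
accumulated.reverse()

# Curtate expectation and expectation of life for ages 10 to 80.
curtate = [round(total / l, 1) for total, l in zip(accumulated[:-1], l_x[:-1])]
complete = [round(total / l + 0.5, 1) for total, l in zip(accumulated[:-1], l_x[:-1])]
for row in zip(ages, accumulated, curtate, complete):
    print(*row)
```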


CHAPTER IV 
GAMMA AND BETA FUNCTIONS 


29. The Gamma Function: Integrations by Substitution:
Indeterminate Forms.—There are several types of definite
integrals in the ordinary calculus which do not conform strictly 
or definitely to the definitions laid down for the ordinary types. 
Most of these special types are included under and referred to 
as improper integrals and refer graphically to areas which, 
though finite, stretch away to infinity in at least one direction. 
One of these types, known as the Gamma function, is unusually 
valuable in the mathematical theory of statistics, especially in 
the treatment of frequency curves; and even though we give 
no attention here to the systematic treatment of frequency 
curves, some knowledge of the function is highly desirable. 
Moreover, a certain development of the function will lead to an 
important relation between certain forms of the function and 
the definite integrals of the equation of a very important curve 
—the normal curve—to be considered later. It should be 
emphasized, however, that the treatment given here is neces- 
sarily elementary. 

The Gamma function may be defined by the definite integral

$$\Gamma(n+1) = \int_0^\infty x^n e^{-x}\,dx. \qquad (26)$$

Reference to any good textbook¹ on the calculus will
reveal the fact that, although the area under consideration
stretches away without limit to the right, the values of the
Gamma function are finite and determinate for all finite values
of $n$ greater than $-1$ (and infinite otherwise).

¹ See Byerly’s “Integral Calculus,” p. 86. Sometimes the Gaussian
symbol $\Pi(n)$ is used instead of the symbol $\Gamma(n+1)$, which is due to
Legendre.

There are many definite integrals which bear no apparent 
resemblance to that given by (26) and which are, nevertheless, 
essentially Gamma functions, as suitable substitutions will 
show. The student is supposed to be familiar with the process 
of integration by substitution, but the process is so important, 
particularly in this chapter, that a typical example will be 
treated in detail. The process to be emphasized here is so 
definite that the student should have no trouble in handling 
similar integrations which follow. As an example, we shall
substitute $x=y/a$ in the following integral to obtain the more
general relation

$$\int_0^\infty x^n e^{-ax}\,dx = \frac{\Gamma(n+1)}{a^{n+1}}. \qquad (27)$$

In making this substitution it is simply important that the
substitution be made properly effective in three places: the
integrand, the differential, and the limits.

The integrand evidently becomes $\dfrac{y^n e^{-y}}{a^n}$ and the differential
$dx$ becomes $\dfrac{dy}{a}$.

The limits on $y$ corresponding to the limits 0 and $\infty$ on $x$,
obtained by substituting these limits for $x$ in the equation
$y=ax$, are likewise 0 and $\infty$. Making these substitutions we
have

$$\int_0^\infty x^n e^{-ax}\,dx = \frac{1}{a^{n+1}}\int_0^\infty y^n e^{-y}\,dy = \frac{\Gamma(n+1)}{a^{n+1}}.$$
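
Relation (27) can be checked numerically. The following Python sketch is an illustration only; the helper `simpson` is our own composite Simpson's rule, the values $n=2.5$ and $a=2$ are arbitrary choices, and the infinite upper limit is replaced by 30, beyond which the integrand is negligible.

```python
import math


def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


n, a = 2.5, 2.0                     # arbitrary illustrative values
numeric = simpson(lambda x: x**n * math.exp(-a * x), 0.0, 30.0)
closed = math.gamma(n + 1) / a ** (n + 1)
print(numeric, closed)
```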


Another important relation is obtained if we integrate (26)
by parts (with $u=x^n$, etc.) to obtain

$$\int_0^\infty x^n e^{-x}\,dx = \Big[-x^n e^{-x}\Big]_0^\infty + n\int_0^\infty x^{n-1} e^{-x}\,dx.$$

Here we need to refer to another important problem in the
calculus, the valuation of indeterminate forms. It will be
recalled that expressions which take either of the forms $\frac{0}{0}$ or $\frac{\infty}{\infty}$
for some value of the independent variable can usually be
valued by taking the derivative of the numerator for a new
numerator and the derivative of the denominator for a new
denominator, or by sufficient repetitions of the process. It
will be seen that the first expression on the right of the equation
given above takes the indeterminate form $\frac{\infty}{\infty}$ for $x=\infty$,
and the application of the process just outlined shows that the
limiting value of the expression is zero; its value for $x=0$ is
obviously zero. We have then the important relation

$$\Gamma(n+1) = n\,\Gamma(n). \qquad (28)$$

If $n$ is a positive integer

$$\Gamma(n+1) = n!.$$

Moreover, according to (26)

$$\Gamma(1) = \int_0^\infty e^{-x}\,dx = 1$$

and it follows from (28) that

$$\Gamma(2) = 1.$$


It is evident from formula (28) that if the value of the
Gamma function is known for all positive numbers located
between any two successive positive integers, say between 1
and 2, the value for any other positive number can be easily
determined. For example, $\Gamma(3.36) = (2.36)(1.36)\,\Gamma(1.36)$.
Excellent tables² of the logarithms of values of the Gamma
function of numbers lying between 1 and 2 have been constructed.
A small table follows.


² Pearson’s “Tables” (7 places).
Legendre’s Works, Vol. II (12 places).


A SHORT TABLE OF VALUES OF 10 + log Γ(n)


 n      .00      .01      .02      .03      .04      .05      .06      .07      .08      .09
1.0  10.0000   9.9975   9.9951   9.9928   9.9905   9.9883   9.9862   9.9841   9.9821   9.9802
1.1   9.9783   9.9765   9.9748   9.9731   9.9715   9.9699   9.9684   9.9669   9.9655   9.9642
1.2   9.9629   9.9617   9.9605   9.9594   9.9583   9.9573   9.9564   9.9554   9.9546   9.9538
1.3   9.9530   9.9523   9.9516   9.9510   9.9505   9.9500   9.9495   9.9491   9.9487   9.9483
1.4   9.9481   9.9478   9.9476   9.9475   9.9473   9.9473   9.9472   9.9473   9.9473   9.9474
1.5   9.9475   9.9477   9.9479   9.9482   9.9485   9.9488   9.9492   9.9496   9.9501   9.9506
1.6   9.9511   9.9517   9.9523   9.9529   9.9536   9.9543   9.9550   9.9558   9.9566   9.9575
1.7   9.9584   9.9593   9.9603   9.9613   9.9623   9.9633   9.9644   9.9656   9.9667   9.9679
1.8   9.9691   9.9704   9.9717   9.9730   9.9743   9.9757   9.9771   9.9786   9.9800   9.9815
1.9   9.9831   9.9846   9.9862   9.9878   9.9895   9.9912   9.9929   9.9946   9.9964   9.9982


In using this table it should be noticed in interpolation that
there is a minimum value of $10+\log \Gamma(n)$ in the neighborhood
of $n=1.46$.
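
Entries of such a table, and the use of formula (28) to pass outside the range 1 to 2, can be checked with the Gamma function of the Python standard library. The sketch below is an illustration only; the function name `table_entry` is our own.

```python
import math


def table_entry(n):
    """10 + log10 Gamma(n), rounded to four decimal places."""
    return round(10 + math.log10(math.gamma(n)), 4)


print(table_entry(1.10), table_entry(1.50), table_entry(1.90))

# Formula (28) carries a value outside 1..2 back into the tabulated range.
print(math.gamma(3.36), 2.36 * 1.36 * math.gamma(1.36))

# Locate the smallest value among n = 1.00, 1.01, ..., 1.99.
grid = [1 + k / 100 for k in range(100)]
n_min = min(grid, key=math.gamma)
print(n_min)
```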


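EXERCISES


Find the limiting values of the following:

1. $\dfrac{x^2-4}{[\text{illegible}]}$ for $x=2$.

2. $\dfrac{x^2-16}{x^2+x-20}$ for $x=4$.

3. $\dfrac{3x^2-4x\,[\text{illegible}]}{2x^2-3x+1}$ for $x=\infty$.

4. $\dfrac{\log_e x}{[\text{illegible}]}$ for $x=\infty$.

5. [illegible]

6. $\dfrac{x}{[\text{illegible}]}$ for $x=\infty$.

7.³ $x \log x$ for $x=0$. (Hint: write it $\dfrac{\log x}{1/x}$.) Ans. 0.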


³ The expression given in this exercise takes the indeterminate form
$0\cdot\infty$, but this form and others, such as $\infty-\infty$, $1^\infty$ (see Ex. 9), etc., can
usually be expressed to take either of the forms $\frac{0}{0}$ or $\frac{\infty}{\infty}$.

8. $x^2 \log x$ for $x=0$. (Write it $\dfrac{\log x}{1/x^2}$.) Ans. 0.

9. $1+i = \left(1+\dfrac{j}{x}\right)^x$ for $x=\infty$.

Hint: $\log (1+i) = x \log \left(1+\dfrac{j}{x}\right) = \dfrac{\log \left(1+\frac{j}{x}\right)}{1/x}$.

Ans. $\log (1+i) = j$,
or $1+i=e^j$.
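
The limiting values stated for Exercises 7 and 9 can be watched numerically. The following Python sketch is an illustration only and proves nothing; the value $j=.06$ is an arbitrary choice of ours.

```python
import math

# x log x as x approaches 0 (Exercise 7).
for x in (1e-2, 1e-4, 1e-8):
    print(x, x * math.log(x))

# (1 + j/x)^x as x increases (Exercise 9), with j chosen arbitrarily.
j = 0.06
for x in (1, 12, 365, 1e6):
    print(x, (1 + j / x) ** x)
print(math.exp(j))
```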


30. The Beta Function.—Another definite integral which is
very important in the mathematical theory of statistics, and
which is closely related to the Gamma function, is the Beta
function, or the First Eulerian Integral (the Gamma function
is sometimes referred to as the Second Eulerian Integral), which
may be defined by the relation

$$\beta(m,n) = \int_0^1 x^{m-1}(1-x)^{n-1}\,dx. \qquad (29)$$

This integral can be shown⁴ to be finite and determinate
in value for all positive values of $m$ and $n$.

If we substitute $1-y$ for $x$ in this integral, in the manner
outlined in the preceding section, we obtain the relation

$$\beta(m,n) = \beta(n,m), \qquad (30)$$

which shows that the values of $m$ and $n$ are interchangeable.

If, moreover, we substitute $\dfrac{y}{1+y}$ and $\dfrac{1}{1+y}$ successively for $x$
in (29) we obtain

$$\beta(m,n) = \int_0^\infty \frac{y^{m-1}}{(1+y)^{m+n}}\,dy \qquad (31a)$$

$$\phantom{\beta(m,n)} = \int_0^\infty \frac{y^{n-1}}{(1+y)^{m+n}}\,dy \qquad (31b)$$

which are, therefore, merely other forms of the Beta function.


⁴ Byerly’s “Integral Calculus,” p. 113.

The value of any Beta function for positive values of $m$
and $n$ can be expressed directly in terms of Gamma functions
and can therefore be found by means of a table of values of the
Gamma function. It happens that this relation between the
Beta function and the Gamma function can be obtained by
merely valuating a certain double integral in two ways and
equating the results on the assumption that the order of integration
(i.e., first with respect to $x$ and then with respect to $y$,
or first with respect to $y$, etc.) is immaterial. To value the
integrals in this case with respect to either variable and in
either order, one needs merely to concentrate his attention
upon the essential factors and recognize the combination
necessary at the time as constituting the Gamma function or
the Beta function, as the case may be. The double integral
is as follows:

$$\int_0^\infty\!\!\int_0^\infty x^{m+n-1}\,y^{m-1}\,e^{-x(1+y)}\,dx\,dy.$$

It is easily verified that the integral with respect to $x$
(treating $y$-terms as constants) is

$$\Gamma(m+n)\int_0^\infty \frac{y^{m-1}}{(1+y)^{m+n}}\,dy = \Gamma(m+n)\,\beta(m,n). \qquad (A)$$

If, however, we integrate first with respect to $y$ we obtain

$$\Gamma(m)\int_0^\infty x^{n-1}e^{-x}\,dx = \Gamma(m)\,\Gamma(n). \qquad (B)$$

Equating (A) and (B) we obtain

$$\beta(m,n) = \frac{\Gamma(m)\,\Gamma(n)}{\Gamma(m+n)}. \qquad (32)$$


If we let $m=n=\tfrac{1}{2}$ in (31a) or (31b) we obtain a form which
we can easily integrate to give

$$\beta(\tfrac{1}{2},\tfrac{1}{2}) = \pi.$$

But, according to (32),

$$\beta(\tfrac{1}{2},\tfrac{1}{2}) = \frac{[\Gamma(\tfrac{1}{2})]^2}{\Gamma(1)} = [\Gamma(\tfrac{1}{2})]^2.$$

Hence,

$$\Gamma(\tfrac{1}{2}) = \int_0^\infty x^{-1/2}e^{-x}\,dx = \sqrt{\pi}.$$

It is easily verified that if we substitute $x^2$ for $x$ in the last
integral we obtain

$$\Gamma(\tfrac{1}{2}) = 2\int_0^\infty e^{-x^2}\,dx = \sqrt{\pi}.$$

Now $y=e^{-x^2}$ is the simplest form of the equation of the
normal curve, which we shall consider at some length later,
and the fact that $x$ occurs only to the second degree shows that
the graph is symmetrical with respect to the $y$-axis. Hence,
twice the area under the curve from $x=0$ to $x=\infty$ is the same
as the area under the whole curve and we may write the relation
given above as follows:

$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}. \qquad (33)$$


If, in addition, we substitute $x/\sqrt{h}$ for $x$ in (33) we obtain
the more general result

$$\int_{-\infty}^{\infty} e^{-x^2/h}\,dx = \sqrt{\pi h}, \qquad (34)$$

which we shall find useful later, in connection with the normal
curve.
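
The relations of this section can be checked numerically. The following Python sketch is an illustration only; the function `beta` is our own name for relation (32), `simpson` is our own composite Simpson's rule, the values $m=2$, $n=3$ are arbitrary, and the infinite limits of (33) are replaced by $\pm 10$, beyond which the integrand is negligible.

```python
import math


def beta(m, n):
    """Relation (32): the Beta function in terms of Gamma functions."""
    return math.gamma(m) * math.gamma(n) / math.gamma(m + n)


def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


# Definition (29) against relation (32) for m = 2, n = 3.
direct = simpson(lambda x: x * (1 - x) ** 2, 0.0, 1.0)
print(direct, beta(2, 3))

# beta(1/2, 1/2) = pi, whence Gamma(1/2) = sqrt(pi).
print(beta(0.5, 0.5), math.pi)

# Relation (33): the whole area under y = exp(-x^2).
area = simpson(lambda x: math.exp(-x * x), -10.0, 10.0)
print(area, math.sqrt(math.pi))
```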

EXERCISES


Many valuable exercises will be found in verifying the various
relations given in the text.

1. Show that $\displaystyle\int_0^\infty x^{-1/2}e^{-x}\,dx = \sqrt{\pi}$.

2. Determine the value of $\pi$ from the relation $\Gamma(0.5) = \sqrt{\pi}$.

3. Show that $\displaystyle\int_0^1 \left(\log \frac{1}{x}\right)^{n-1} dx = \Gamma(n)$.

Substitute $e^{-x}$ for $x$.

4. The Psi ($\psi$) function is defined as the derivative of the logarithm
of the corresponding Gamma function. Given that

$$\Gamma(x+a) = (x+a-1)(x+a-2)\cdots x\,\Gamma(x),$$

where $a$ is a positive integer, show that

$$\psi(x+a) = \psi(x) + \frac{1}{x} + \frac{1}{x+1} + \frac{1}{x+2} + \dots \text{ to } a \text{ terms}.$$

5. Show that $\beta(m,n)\,\beta(m+n,l) = \dfrac{\Gamma(l)\,\Gamma(m)\,\Gamma(n)}{\Gamma(l+m+n)}$.

6. Show that $\dfrac{\beta(l+m,n)}{\beta(m+n,l)} = \dfrac{\Gamma(l+m)\,\Gamma(n)}{\Gamma(m+n)\,\Gamma(l)}$.

7. Show that $\displaystyle\int_0^a x^{m-1}(a-x)^{n-1}\,dx = a^{m+n-1}\,\beta(m,n)$.

Substitute $x$ for $x/a$.

8. Show that $\displaystyle\int_0^1 x^{l-1}(1-x^n)^{m-1}\,dx = \frac{1}{n}\,\beta\!\left(\frac{l}{n},m\right)$.

Substitute $x^n$ for $x$.

9. Show that $\displaystyle\int_0^1 \frac{x^{m-1}+x^{n-1}}{(1+x)^{m+n}}\,dx = \beta(m,n)$.

Add together formulas (31a) and (31b), separate result into
$\displaystyle\int_0^1 + \int_1^\infty$ and substitute $1/x$ for $x$ in the last part.

10. Show that the number of combinations of $n$ things taken $r$ at
a time, or

$${}_nC_r \left(= \frac{n!}{r!\,(n-r)!}\right) = \frac{n}{r(n-r)\,\beta(n-r,r)}.$$

11. Show that $\beta(6,4) = 1/504$.

12. Show that $\beta(1,1)=1$.

13. Find the values of (a) $\log \beta(2.36, 2.49)$.
(b) $\log \beta(3.4, 3.6)$.

CHAPTER V 
PROBABILITY 


31. A Priori Probability: Simple Events.—If a bag contains
three white and five black balls, and one ball is drawn out
at random, what is the probability that this ball is white?
The event in question is said to happen if a white ball is drawn
and to fail if a black ball is drawn. The number of distinct
ways in which the event may happen is three and the total
number of possible ways in which it may happen or fail is eight.
The fraction $\tfrac{3}{8}$ then is said to be the probability of drawing a
white ball. This illustrates the following definition of probability:

If all the happenings and failings of an event can be analyzed
into $h+f$ possible ways, each of which is equally likely, and if in $h$
of these ways the event will happen and in $f$ of them fail, the probability
that the event will happen is $\dfrac{h}{h+f}$ and the probability that
it will fail is $\dfrac{f}{h+f}$.

The chances of the event happening are said to be as $h$ is to $f$.

Corollary. The sum of the probability that an event will 
happen and the probability that it will fail is 1, which is the 
symbol for certainty. The symbol for certain failure is 0. 

In applying the definition of probability, the fact should
not be overlooked that all the ways are assumed to be equally
likely. To illustrate the need of precaution in this matter,
consider the question: What is the probability of throwing
“head” at least once in two throws of a coin? We might
give the following as the equally likely cases: HH, HT and TT;
whence, the probability would be $\tfrac{2}{3}$. Further consideration,
however, makes it clear that the case HT is twice as likely as
HH or TT because it can happen in two ways, that is, HT or
TH. The probability desired is therefore $\tfrac{3}{4}$.
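
The four equally likely cases can be enumerated mechanically. The following Python sketch is an illustration only and not part of the original text; it lists the ordered results of two throws and counts those containing at least one head.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))          # HH, HT, TH, TT
favourable = [o for o in outcomes if "H" in o]
probability = Fraction(len(favourable), len(outcomes))
print(outcomes, probability)
```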

The probabilities referred to above are said to be determined 
a priori and are often referred to as a priori probabilities to 
distinguish them from probabilities which will be considered 
later and which can not be determined by an a priori analysis 
of the various possibilities. 

We shall refer to events which can only happen or fail as 
simple events to distinguish them from events which involve 
other possibilities. 

Sometimes the formula for the number of different combinations
of $n$ things taken $r$ at a time proves useful in determining
an a priori probability, especially when the total number
of ways an event can happen is a large number. This formula
is

$${}_nC_r = \frac{n(n-1)\cdots(n-r+1)}{r!} = \frac{n!}{r!\,(n-r)!}.$$

As an example, let us determine the probability of drawing
two white balls and three black balls in drawing five balls at
random from a bag containing four white balls and six black
balls. The total number of different ways of drawing five
balls from a bag containing ten balls is ${}_{10}C_5 = \dfrac{10\cdot 9\cdot 8\cdot 7\cdot 6}{5!}$ or 252.
The number of different ways of drawing two white balls from
four white balls is ${}_4C_2 = \dfrac{4\cdot 3}{2} = 6$, and the number of ways of
drawing three black balls from six black balls is ${}_6C_3=20$;
hence, the number of ways of drawing two white balls and three
black balls is $6\cdot 20=120$. The desired probability is then
$120/252 = 10/21$.
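
The counting in this example can be reproduced with the combination function of the Python standard library. The sketch is an illustration only; the variable names are our own.

```python
from fractions import Fraction
from math import comb

ways_total = comb(10, 5)        # five balls from ten
ways_white = comb(4, 2)         # two white balls from four
ways_black = comb(6, 3)         # three black balls from six
probability = Fraction(ways_white * ways_black, ways_total)
print(ways_total, ways_white, ways_black, probability)
```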


EXERCISES 


1. A bag contains 3 red, 4 black and 5 white balls; if 1 ball is
drawn at random, what is the probability that it is a red ball? A red
or a black ball?

2. Three dice are thrown.

(a) What is the probability of throwing a 7 (the sum of 
the upper faces)? 

(b) Show that the probability of throwing a 14 is five 
times that of throwing a 4. 

(c) Show that the probabilities of throwing a 10 or an 11 
are the same. What is the probability? 


3. If the probability of a certain event happening is 4 times the 
probability of its failing, what is the probability of its happening? 

4. What is the probability of throwing a head in 3 throws of a 
coin? 

5. What is the probability of throwing an ace in 6 throws of a 
die? 

6. What is the probability of throwing exactly 1 head in 3 throws 
of a coin? 

7. A bag contains 4 white and 6 black balls; find the probability 
of drawing exactly 2 white balls out of 5 drawn at random. At least 
2 white balls. 

8. A purse contains 2 dimes, 3 quarters and 4 half-dollars. Assum- 
ing that one coin is as likely to be drawn as another, what is the prob- 
ability that if a single coin is drawn it will be either a quarter or a 
half-dollar? 

9. If 12 students are seated at random in a row, what is the 
probability that A and B are next to each other? 

10. Three balls are drawn at random from a bag containing 5 black 
and 4 white balls. What is the probability that 2 are black and 1 
white? 

11. Two cards are drawn at random from a suit of 13 cards. What 
is the probability that the 2 cards are an ace and a king? 


32. Independent Events.—Two or more events are said to 
be independent when the occurrence of any one of them is not 
affected by the occurrence or non-occurrence of any of the rest. 
Thus, the results of two drawings of a ball from a bag are 
independent if the ball is returned after the first drawing, but 
interdependent if the ball is not returned.

Theorem.—The probability that all of a set of independent
events will occur is the product of the probabilities of the single 
events. 

For, consider two such events whose probabilities are a/n
and b/m respectively. The number of equally likely possible
cases for and against the first event is n, for and against the
second m, and since the events are independent any one of the
n cases may occur with any one of the m cases. Hence, the
number of equally likely cases for and against the occurrence of
both events is nm. By the same reasoning, ab of these cases
favor the occurrence of both events. Therefore, the probability
that both events will occur is ab/nm, or (a/n)(b/m). The
demonstration for the case of more than two events is similar.

Thus, the probability of throwing an ace twice in succession
with a single die is 1/6 · 1/6 = 1/36. Likewise, the chance of drawing
a white ball twice in succession from a bag which contains four
white and three black balls, the ball first drawn being returned
before the second drawing, is 4/7 · 4/7 = 16/49.
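
The reasoning of the proof, that every case of the first event may pair with every case of the second, can be imitated by listing the pairs outright. The sketch below is only an illustration under that assumption; the helper name probability is hypothetical and does not come from the text.

```python
from fractions import Fraction
from itertools import product


def probability(cases, favors):
    """A priori probability: favorable cases over all equally likely cases."""
    cases = list(cases)
    favorable = sum(1 for case in cases if favors(case))
    return Fraction(favorable, len(cases))


# Two throws of one die: every first result may pair with every second result.
die = range(1, 7)
two_aces = probability(product(die, repeat=2), lambda c: c == (1, 1))

# Two drawings with replacement from a bag of 4 white and 3 black balls.
bag = ["W"] * 4 + ["B"] * 3
two_whites = probability(product(bag, repeat=2), lambda c: c == ("W", "W"))

print(two_aces, two_whites)  # 1/36 16/49
```

In each case the count of pairs agrees with the product of the single probabilities, as the theorem requires.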

33. Mutually Exclusive Events.—If two or more events are 
so related that but one of them can occur, they are said to be 
mutually exclusive. Thus, the throwing of a “head” and the
throwing of a “tail” in the same throw of a single coin are
mutually exclusive events. 

Theorem.—The probability that some one or other of a set of 
mutually exclusive events will occur is the sum of the probabilities 
of the single events. 

For, consider two mutually exclusive events A and B. 
The possible cases with respect to the two events are of three 
kinds, all mutually exclusive, namely, those for which (1) A 
happens, B fails; (2) A fails, B happens; (3) A fails, B fails. 
Let the numbers of equally possible cases of these three kinds
be l, m and n respectively. Then the probability of the single
event A is

$$\frac{l}{l+(m+n)},$$

for, since A never happens except when B fails, the l cases in
which A happens and B fails are all the cases in which A
happens, and the m+n cases in which A fails and B happens
or A and B both fail are all the cases in which A fails.
Similarly, the probability of the single event B is

$$\frac{m}{m+(l+n)}.$$

Hence, the probability that either A or B happens is

$$\frac{l+m}{l+m+n} = \frac{l}{l+(m+n)} + \frac{m}{m+(l+n)}.$$

The proof for more than two events is similar. 

Thus, if one ball be drawn from a bag containing 3 white,
5 black and 7 red balls, since the probability of its being white
is 1/5 and that of its being black is 1/3, the probability of its
being either white or black is 1/5 + 1/3 = 8/15.

Care must be taken to apply this theorem only when events 
are mutually exclusive. Thus, if asked to find the probability 
that a problem will be solved if both A and B attempt it, A’s 
probability of success being ? and B’s 2, we can not obtain 
the desired result by merely adding ? and 2, since the two 
events (A succeeds and B succeeds) are not mutually exclusive. 
The mutually exclusive cases are: A succeeds, B fails; A fails, 
B succeeds; A succeeds, B succeeds. The probabilities of 


these events are 73; (=2-3), y (=3-3) and 58 (=3-3) 
respectively; and the sum of these probabilities or 44 is the 


probability that the problem will be solved. This problem 
could also be solved as follows: the probability that both will 
fail is }-}= 7!y; the probability that both will not fail—that is, 
that at least one will solve the problem—is then 1—+),, or 44. 
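
The two routes to the result, adding the mutually exclusive cases and subtracting the probability of a double failure from unity, may be compared in a short sketch. The success probabilities 3/5 and 2/3 used here are hypothetical values chosen for the illustration, the function names are likewise assumptions, and the attempts of A and B are taken to be independent.

```python
from fractions import Fraction


def solved_by_cases(a, b):
    """Add the three mutually exclusive cases in which the problem is solved."""
    only_a = a * (1 - b)      # A succeeds, B fails
    only_b = (1 - a) * b      # A fails, B succeeds
    both = a * b              # A succeeds, B succeeds
    return only_a + only_b + both


def solved_by_complement(a, b):
    """One minus the probability that both fail."""
    return 1 - (1 - a) * (1 - b)


# Hypothetical values, chosen only for illustration.
a, b = Fraction(3, 5), Fraction(2, 3)
print(solved_by_cases(a, b), solved_by_complement(a, b))  # 13/15 13/15
print(a + b)  # 19/15, greater than 1, so simple addition cannot be right
```

With these assumed values the simple sum of the two probabilities exceeds unity, which shows at once why the addition theorem may not be applied to events that are not mutually exclusive.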


EXERCISES


1. If the probability that A will live ten years is % and the prob- 
ability that B will live ten years is 3%, what is the probability that
both will be alive after ten years? What is the probability that one
or the other will be alive then? 

2. A bag contains 2 white, 3 black and 4 red balls. What is the 
probability that a ball drawn at random will be either white or red? 

3. What is the probability of throwing either an ace or a deuce 
in a throw of 2 dice? 

4. A traveler has five connections to make in order that he may 
reach his final destination on time. If his estimates that for each of 
these connections the chances are 2 to 1 in his favor are correct, 
what is the probability of his making all his connections? 

5. If the probability that A and B will survive a certain period is 
% and 3% respectively, what is the probability that: 


(a) one or the other will die in the period? 
(b) exactly one will die in the period? 
(c) both will die in the period? 


6. If the probability that each of n individuals will survive a 
certain period is p, what is the probability that at least one will die in 
the period? 

7. If the probability that the age of a man selected at random 
from a group of men is between 20 and 25 years is 4, and the prob- 
ability that it is between 25 and 35 is 4, what is the probability that 
his age is between 20 and 35? 

8. The probability that A will solve a problem if he attempts it 
is #, and that of B 4. What is the probability that the problem will 
be solved if both try it? What is the probability that exactly one of 
them will solve it? What is the probability that both will solve it? 

9. A bag contains 3 red, 4 black and 5 white balls. Suppose that 
2 balls are drawn at random; what is the probability that: 


(a) both are red? 

(b) both are black? 

(c) both are either red or (both) black? 

(d) they consist of exactly one red and one black? 
(e) they are either black or red? 

(f) both are white?

(g) exactly one is white?
(h) at least one is white? 
(i) Solve (e) using results of (f) and (g). 


10. If the probabilities of A, B, C and D surviving a certain period 
are &, Z, 8 and 3°, respectively, what is the probability that at least one 
of the four will die in the period? 

11. If the probabilities of A, B, C and D dying in a certain period 
are 4, 1, 4 and +5 respectively, what is the probability that at least 
one of the four will die in the period? What is the probability that 
all will survive the period? 


34. Empirical Probability: Homogeneous Populations.—The
fraction h/(h+f), which we have called the a priori probability
of an event, means little so far as the actual outcome of a single
trial or a small number of trials of the event is concerned. It
should, however, indicate the frequency with which the event
would occur in the long run, that is, in the course of an indefinitely
long series of trials. Thus, if one should try the experiment
of throwing a coin a very great number of times, say
several thousand times, one would find that, as the number of
throws increases, the ratio of the number of times that a head
appears to the total number of throws approaches the value 1/2
more and more closely and steadily. In general, if h be used
to refer to the number of times a certain event occurs and n
refers to the number of trials, and p the corresponding probability,
then the value of this probability may be defined by
the relation

$$p = \lim_{n\to\infty} \frac{h}{n}.$$

Or the value of the probability in question may be defined
as the limiting value of the ratio h/n as n, the number of trials,
increases without limit. Such a probability is called an
empirical or a posteriori probability.
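
The tendency of the ratio h/n to settle down may be imitated with a pseudo-random number generator standing in for the coin. This is only a sketch: the generator, the seed and the function name relative_frequency are assumptions of the illustration, and a simulation of this kind suggests the limiting behavior without proving it.

```python
import random


def relative_frequency(trials, p=0.5, seed=0):
    """Ratio h/n for `trials` simulated trials of an event of probability p."""
    rng = random.Random(seed)  # fixed seed, so that a run can be repeated
    h = sum(1 for _ in range(trials) if rng.random() < p)
    return h / trials


# The ratio for longer and longer series of simulated throws.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, relative_frequency(n))
```

The printed ratios wander noticeably for small n and keep closer to 1/2 as n grows, which is the behavior described for the actual coin.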

Now, it is only proper that we approach the subject of
probabilities from the point of view of a priori probabilities,
or probabilities that are associated with games of chance; but 
the student will soon observe that the vast majority of the most 
important problems in determining the values of probabilities 
in practice require an entirely different mode of procedure. 
Thus, if we wish to determine the probability that a man aged 
30 will die in the following year, we have no way of analyzing 
all the happenings and failings which are equally likely; in 
fact, we know that the chances of a number of men aged 30 
dying in the following year are not at all equal. We have 
noticed, however, that whatever the value of the desired prob- 
ability may be, it will tend to be exemplified as the number of 
trials increases indefinitely. This fact suggests that such a
probability be determined empirically. 

The author wishes now to call attention to a very impor- 
tant distinction. The total field of events, or population, of a 
given investigation may be far from homogeneous; and since it 
is obviously impossible to consider all events in determining an 
empirical probability whose population is infinite, we are left 
with the sole possibility of taking what we shall call a random 
sample of the whole population, hoping to arrive at an approxi- 
mation of the desired probability. If the discrepancy proves 
to be no greater relatively than that which we usually find in 
properly conducted games of chance, the sample is said to be 
representative of the whole population. If the discrepancy is 
obviously excessive, we conclude either that the sample was 
not selected at random or that the population is not sufficiently 
homogeneous. We shall find, as we proceed, that we can 
control the discrepancies or errors due to random sampling in 
a homogeneous population; but we can not always be sure that 
we have a random sample, and it is particularly difficult to 
insure that we are dealing with a population which is sufficiently
homogeneous. As it is characteristic of the great
majority of empirical investigations in various fields that the 
results must be obtained from a sample, the question of the 
homogeneity of the population is exceedingly important. In
the absence of definite knowledge of the homogeneity of a population,
the wise investigator will regard new results merely as
tentative and subject to check by results of similar character. 

A method of investigating the homogeneity of a given 
population will be considered in a later chapter, but it will be 
necessary to assume for the present that the populations considered
in the immediately succeeding chapters are homogeneous.


EXERCISES


1. We shall denote the probability that a person aged x will survive
n years by ${}_np_x$ and the product ${}_np_x\cdot{}_np_y$ by ${}_np_{xy}$. Given, then, that
two persons are aged x and y, express the probability that:


(a) both will live n years; 

(b) both will die within n years;

(c) both will not live n years; 

(d) both will not die within n years; 
(e) at least one will live n years; 

(f) at least one will die within n years; 
(g) exactly one will live n years; 

(h) exactly one will die within n years.


2. Given three persons aged x, y and z, express the probability that: 


(a) all three will live n years; 

(b) all three will not live n years; 

(c) at least one will die within n years; 
(d) at least one will live n years; 

(e) at least two will live n years; 

(f) at least two will die within n years; 
(g) all three will not die within n years; 
(h) exactly two will live n years; 

(i) exactly two will die within n years;
(j) no more than one will live n years;
(k) no more than one will die within n years.

3. To what events do the following probabilities refer?


(a) $1 - {}_np_{xy}$;

(b) $1 - {}_np_{xyz}$;

(c) $(1 - {}_np_x)(1 - {}_np_y)$;

(d) $1 - (1 - {}_np_x)(1 - {}_np_y)(1 - {}_np_z)$;

(e) ${}_np_x + {}_np_y$ (criticize);

(f) ${}_{n-1}p_x - {}_np_x$;

(g) ${}_np_x(1 - {}_np_y) + {}_np_y(1 - {}_np_x)$;

(h) ${}_np_x - {}_np_{xy}$;

(i) ${}_np_x\cdot{}_np_{xy}$ (criticize);

(j) ${}_np_x + {}_np_y - {}_np_{xy}$;

(k) ${}_np_x + {}_np_y - 2\,{}_np_{xy}$.


35. Repeated Trials of a Single Event.—The following 
theorems are concerned with the question of the chance that a 
certain event will occur a specified number of times in the 
course of a series of trials, the chance of its occurrence in a single 
trial being known. 

Theorem.—If the probability that an event will occur in a
single trial is p, the probability that it will occur exactly r times in
the course of n trials is ${}_nC_r\,p^r q^{n-r}$, where $q = 1 - p$.

For, the probability that it will occur in all of any particular
set of r trials and fail on the remaining $n-r$ trials is $p^r q^{n-r}$.
But since there are n trials all told, we may select this particular
set of r trials in ${}_nC_r$ ways, which are, of course, mutually exclusive.
Hence, the probability in question is ${}_nC_r\,p^r q^{n-r}$. Are all
the events of such a particular set of r trials independent? Is
this necessary? Explain. Why must the various sets of r
trials be mutually exclusive?

Thus, the chance that ace will turn up exactly twice in 5
throws with a single die, or that out of 5 dice thrown simultaneously
exactly 2 will turn up ace, is ${}_5C_2\,(1/6)^2(5/6)^3$, or 625/3888.
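
The theorem lends itself to direct computation. The sketch below is an illustration only, with the function name exactly an assumption of ours; it evaluates the product of the number of combinations and the powers of p and q in exact fractions for the example of the five throws of a die.

```python
from fractions import Fraction
from math import comb


def exactly(r, n, p):
    """Probability of exactly r occurrences in n independent trials."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)


ace = Fraction(1, 6)
print(exactly(2, 5, ace))                          # 625/3888
print(sum(exactly(r, 5, ace) for r in range(6)))   # 1, the whole of (p+q)^5
```

The second line adds the probabilities for r = 0, 1, ..., 5; since these cases are mutually exclusive and exhaust the possibilities, their sum is unity.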

Observe that ${}_nC_r\,p^r q^{n-r}$ is the term containing $p^r$ in the
expansion of $(p+q)^n$. What is the number of the term?

Theorem.—The probability that such an event will occur at
least r times in the course of n trials is the sum of the first $n-r+1$
terms in the expansion of $(p+q)^n$, namely

$$p^n + {}_nC_1\,p^{n-1}q + {}_nC_2\,p^{n-2}q^2 + \cdots + {}_nC_{n-r}\,p^r q^{n-r}.$$


For, the event will occur at least r times if it occurs exactly 
r times or exactly any number of times greater than r, and the 
terms of the expansion given above refer to mutually exclusive 
events and represent the probabilities of the occurrence of the 
event exactly n, exactly $n-1$, ..., exactly r times, respectively.
Thus, the chance that ace will turn up at least 4 times in the
course of 5 throws with a single die is $(1/6)^5 + 5\,(1/6)^4(5/6)$, or 13/3888.
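
The sum of terms named in this theorem may be formed in the same manner. In the following sketch, which is again only an illustration with an assumed function name at_least, the mutually exclusive cases of exactly r, exactly r+1, and so on up to exactly n occurrences are added together.

```python
from fractions import Fraction
from math import comb


def at_least(r, n, p):
    """Probability of at least r occurrences in n independent trials,
    found by adding the mutually exclusive cases k = r, r+1, ..., n."""
    q = 1 - p
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(r, n + 1))


print(at_least(4, 5, Fraction(1, 6)))  # 13/3888
```

Taking r = 0 gives the whole expansion of (p+q)^n, whose value is unity, which serves as a convenient check.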


EXERCISES 


1. What is the probability that, in 6 throws of a coin, at least 3 
will be heads? 

2. Five coins are thrown. What is the probability that exactly 
2 of them are heads? Ans. 5/16.

3. Find the probability of throwing at least 8 in a single throw 
with 2 dice. 

4. If A’s probability of winning any single game against B is 3, 
find the probability of his winning at least 3 games out of 7. 

5. Show that the probability of throwing exactly r heads and $n-r$
tails in a single throw of n coins is ${}_nC_r \div 2^n$.


36. Cogent Reason and Insufficient Reason.—The author 
wishes now to call attention to a distinction between two 
principles of logic which are often confused in determining the 
values of probabilities. The distinction is very subtle and can 
not be established sufficiently rigorously to insure that one can 
recognize it clearly in all situations. The best that can be 
done here is to illustrate. We say that the probability of throwing
a “head” with a single coin is 1/2, because we believe that
the throw is just as apt to prove “head” as “tail” or “tail”
as “head.” But suppose that we have no more reason to
believe that a certain book contains pictures than that it does
not contain them; are we justified in assuming that the probability
that the book contains pictures is 1/2? The value of
the former probability is said to be determined by cogent reason 
and the value of the latter probability—if it were acceptable— 
by insufficient reason. It should be obvious that, however 
subtle the situation may be, there is a little positive information 
in examples like the former, which removes them entirely from 
the class of examples like the latter, about which we know 
absolutely nothing. We simply say that examples of the latter 
type lie outside the domain of probabilities. 

It is usually easy to establish the absurdity of a particular
result of insufficient reasoning. To revert to the example considered
above, we would be just as much justified in saying
that the probability that the book contains pictures of a certain
type is 1/2 and, hence, that the probability that the book does
not contain pictures of that type would also be 1/2. Similarly,
the probability that the book does not contain pictures of a
second type would also be 1/2, and so on. The probability,
then, that the book does not contain pictures of any of a large
number of types would be a large power of 1/2, or a very small
fraction. Finally then, the probability that the book contains
pictures of at least one type would be unity minus this very
small fraction, or approximately unity, a value quite different
from the 1/2 which we assigned to the probability originally; and
the contradiction is obvious.
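
The size of the contradiction is easily exhibited in figures. The sketch below assumes, as the multiplication in the argument implies, that the several types of pictures are treated as independent; the function name at_least_one_type is an assumption of the illustration.

```python
from fractions import Fraction


def at_least_one_type(types, p_absent=Fraction(1, 2)):
    """Probability of pictures of at least one of `types` types, when each
    type is (by insufficient reason) given probability p_absent of absence."""
    return 1 - p_absent**types


for types in (1, 2, 5, 10, 20):
    value = at_least_one_type(types)
    print(types, value, float(value))
```

With a single type the value is 1/2, but it rises toward unity as more types are admitted, although nothing whatever has been learned about the book in the meantime.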

The distinction made above is of great fundamental impor- 
tance in the analysis of the observations in many investigations. 
As a simple example, if we were particularly anxious to obtain 
a very accurate reading of a finely graduated barometer, we 
should probably make a large number of readings and select 
the most probable value by methods to be introduced later, 
which would depend fundamentally upon cogent reason or 
belief that deviations to one side of the correct value are just 
as likely as deviations to the other side.


SOME FAMOUS PROBLEMS AND FALLACIES


1. D’Alembert believed the probability of throwing heads at least
once in 2 throws of a coin to be 2/3. Criticize.

2. Leibnitz thought that a throw of 12 with 2 dice was as probable 
as a throw of 11. Criticize. 

3. Would the fact that a coin has been thrown to give heads several
times in succession affect the probability of obtaining heads in the
next throw? D’Alembert believed that it would. Moreover, Bequelin
believed that if heads had been thrown n times in succession the
probability that the next throw would yield a head would be 1/(n+1).

4. In seeking the probability of obtaining either 3 heads or 3 tails
in a single throw of 3 coins, it has been reasoned that of 3 coins at least
2 must show heads or tails, and the probability that the third coin will
be the same as the other 2 is 1/2, and that the desired probability is
therefore 1/2. Locate the fallacy. What is the probability?

5. The famous problem known as the “martingale” consists in
determining the relative chances of a poor man and a rich man who
engage in a game of chance, in which the poor man continues, until
he loses, to stake all he has against a like amount with the even chance
of losing all or doubling his fund. The game ends, of course, whenever
the poor man once loses. Cardan is said to have shown that the
condition of play imposes a great disadvantage on the rich man, but
we have no traces of his method of reasoning. What is your opinion
of the relative chances of the two players? What does the rich man
gain when the poor man once loses?

6. The famous St. Petersburg problem may be stated essentially
as follows: A coin is tossed until head is obtained. Peter is to pay
Paul 1 dollar if head appears for the first time on the first toss, 2
dollars if on the second toss, and, in general, $2^{n-1}$ dollars if on the
n-th toss. What is Paul’s expectation,¹ or what should Paul pay
Peter at the outset so that the play will be fair to both? Show that
if a sufficient number of games were played Paul would stand to gain
any amount, however large.

¹ Mathematical expectation of an event is defined as the product of
the probability of the occurrence of the event and the amount to be gained
if the event occurs.

7. Assuming that a certain gambler will win 3 games out of 5 in
which he plays, and in which he always stakes 1/2 of his funds against
an equal amount, show that he must inevitably lose, by showing that
his fund after playing 5n games would be $(27/32)^n$ times his original
fund.
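
The arithmetic behind this last problem may be sketched as follows, taking the stake to be one-half of the current fund, so that each game won multiplies the fund by 3/2 and each game lost multiplies it by 1/2. The function name fund_after is an assumption of the illustration.

```python
from fractions import Fraction


def fund_after(wins, losses, stake=Fraction(1, 2)):
    """Multiple of the original fund left after the given wins and losses,
    a fixed fraction `stake` of the current fund being risked each game.
    The order of the wins and losses does not affect the product."""
    return (1 + stake)**wins * (1 - stake)**losses


per_five_games = fund_after(3, 2)
print(per_five_games)              # 27/32
print(float(per_five_games**10))   # multiple remaining after 50 games
```

Since the multiple for each set of five games is less than unity, its powers diminish without limit as the play continues.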


MISCELLANEOUS EXERCISES 


1. An illiterate servant places 12 similar books on a shelf. What 
is the probability that 3 volumes of a set are together? What is the 
probability that they are together in their proper order? 

2. Show that the probability of a leap year containing 53 Sundays
is 2/7.


3. If 4 cards are drawn from a pack of 52 cards, show that the
probability of there being 1 from each of the suits is
$13^3/(49\cdot 25\cdot 17)$.

4. Two of 8 keys on a ring will open a certain door. What is
the probability that the door can be unlocked by 1 of 3 keys selected
at random?

5. Each student in a certain class of 10 is likely to make a sufficiently
correct observation of the sun’s transit 1 time out of 4. What
is the probability that a given transit will be correctly observed by
the entire class? What is the probability that it will be correctly
observed at all if the entire class attempts it?

6. A party of 10 attending the opera are to occupy 6 seats in one
group and 4 seats in another group. If the division is made at random,
what is the probability of 2 given individuals, A and B, being in the
same group?

7. What is the probability of a player in a game of whist holding
3 aces? At least 3 aces?

8. One purse contains 6 silver dollars, and 4 quarters. Another
contains 2 silver dollars and 10 quarters. If a purse is selected at
random and a coin extracted at random, what is the probability that
the coin selected will be a dollar?

9. If 5 books are brought at random from a shelf of 20, what is
the probability that 3 desired books will be among them?

10. What is the probability of drawing 4 red balls, 2 white balls 
and 5 black balls, in drawing 11 balls at random from a bag contain- 
ing 7 red, 6 white and 9 black balls? 

11. What is the probability of throwing an ace in 3 throws of a 
die? 

12. What is the probability of throwing a head in n throws of a 
coin? 

13. What is the probability of throwing an ace exactly once in 
3 throws of a die? 

14. What is the probability of throwing a 10 (the sum of the 
upper faces) with 3 dice? Of throwing a 9? 

15. A bag contains 10 times as many white balls as black balls, 
and 1 ball is drawn at random. What is the probability that the ball 
drawn is white?

16. What is the expectation of a gambler who is to win $30 if he 
throws a 17 with 3 dice? 

17. A committee of 4 is to be selected at random from a group of 
3 sophomores, 4 juniors and 5 seniors. What is the probability that 
the committee will consist of: 


(a) 2 juniors and 2 seniors? 
(b) 1 sophomore, 1 junior and 2 seniors? 
(c) 4 seniors? 


18. If 3 dice are thrown, what are the probabilities of throwing: 
(a) 3 sixes? 
(b) 2 sixes and a five? 
(c) a six, a five and a four? 


19. Given that it is an even chance that a certain ship will 
encounter a storm, the probability that the ship will spring a leak in 
the storm is 7/9; if a leak occurs, the chances are 9 to 10 that the 
engine will pump her out, and if they fail the chances are 3 to 4 that 
the compartments will keep the ship afloat; and finally, if she sinks 
the chances are even that a traveler will be saved. What is the 
probability that the traveler will be lost at sea? 


CHAPTER VI 
AVERAGES AND AIDS IN THEIR COMPUTATION 


37. Arithmetic Average, or Mean: The Geometric Average: 
The Median.—One of the main purposes included under the 
general heading of “ analysis of statistics ” is to set up a syste- 
matic method of computing the values of certain terms which 
will serve to describe a sample of numerical observations so 
well that a significant difference between the values of such a 
term corresponding to two samples will permit one to differ- 
entiate between the two fundamental situations. It should 
be obvious that if the value of such a term were sufficiently 
descriptive its quotation might well obviate the necessity of 
exhibiting the entire set of individual observations. The 
importance of the latter statement should be evident when it is 
pointed out that, in the ideal investigation, the number of 
observations will be made as large as possible, because such a 
procedure probably constitutes the best scheme to be followed 
in practice to insure that the value of the term under considera- 
tion shall be representative. 

No one term is at the same time more useful and more 
familiar to the ordinary mind than the arithmetic average, or 
mean, of a set of numerical measurements or observations, 
which may be defined as the sum of the values of the observations 
divided by their number. As an example, it is easily verified
that the arithmetic average of the following observations is 4.38.

200 3.76 4.16 4.40 4.72 5.00 
3.32 3.80 4.16 4.40 4.72 5.00 
3.68 3.92 4.16 4.44 4.76 5.08 
Bo 3.92 4.28 4.60 4.88 52S 
3.72 4.08 4.36 4.64 4.96 5.40 
3.72 4.12 4.40 4.68 5.00 Onike 

A statement of the meaning, etc., of the observations given
above, which is immaterial for the present purpose, will be
made later in another section.

It will be noticed that several of the values given above are 
the same. It should be evident that the same value would be 
found for the mean if, instead of summing the individual 
values, each value were first multiplied by the number of times 
it occurs and all such products were summed. Such an average 
is called, for purposes of distinction, the weighted arithmetic 
average. The set of observations given above would scarcely 
justify such a procedure, but we shall find that a vast majority 
of the arithmetic averages computed in practice will be essen- 
tially weighted arithmetic averages. Suppose that a set of 
observations were to occur with frequencies as indicated below: 


Observations    Frequencies    Products

    4.16             6         4.16 × 6 =  24.96
    4.20             7                     29.40
    4.40            10                     44.00
    4.46             9                     40.14
    4.52             4                     18.08
                    --                    ------
                    36                    156.58

                         156.58 ÷ 36 = 4.35


It would be only natural to compute the weighted arithmetic
average in such a case. The natural method of computation
is shown to the right. It should be noted that the value
obtained (4.35) would be the same if it were computed strictly 
in accordance with the original definition of an arithmetic 
average. 
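
The agreement of the two methods of computation may be verified mechanically. The following sketch is an illustration only; the names table, weighted_average and plain_average are assumptions, and the observations are entered as strings so that they are read as exact decimal fractions.

```python
from fractions import Fraction

# Observations and frequencies from the table above.
table = [("4.16", 6), ("4.20", 7), ("4.40", 10), ("4.46", 9), ("4.52", 4)]


def weighted_average(pairs):
    """Sum of value times weight, divided by the sum of the weights."""
    total = sum(Fraction(value) * weight for value, weight in pairs)
    return total / sum(weight for _, weight in pairs)


def plain_average(pairs):
    """The original definition: list every observation, then sum and divide."""
    values = [Fraction(value) for value, weight in pairs for _ in range(weight)]
    return sum(values) / len(values)


mean = weighted_average(table)
print(round(float(mean), 2))           # 4.35
print(mean == plain_average(table))    # True
```

The same function serves when the weights express relative accuracy or importance rather than frequencies of occurrence.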

Sometimes observations are weighted in exactly the same 
way as that illustrated above but for a different reason. For 
example, the five different observations suggested above might 
have been obtained in such a way that they would not be 
equally accurate, and there might be some reason why we should 
like to weight them in accordance with, say, the numbers which 
indicate frequencies above. Such a scheme is very common. 
For example, the various grades of a student (quizzes, recitation,
final examination, etc.) are almost invariably weighted
in some way or other.

The term “average” is used quite widely to refer to the
arithmetic average, and we shall follow that custom fairly con- 
sistently in the future, but there are other kinds of averages 
which would be preferable to the arithmetic average in some 
situations. The geometric average of a set of observations is 
obtained by multiplying the observations together and extract- 
ing the root corresponding to the number of the observations. 
For example, suppose that the population of a certain com- 
munity gains by 4, } and 2 of its population in three successive 
years, respectively. The average (geometric) annual rate of 
gain would be 


) 


Sa 
ViIxXEX2 =1. 


Sometimes it is desirable to weight the observations in 
determining the geometric average by raising each observation 
to its respective weight as a power. 
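
The definition may be put in the form of a short sketch. The observations used below are hypothetical and were chosen only so that the roots are simple; the function name geometric_average is an assumption, and so is the convention, not stated in the text, that with weights the root extracted is that of the sum of the weights.

```python
from math import prod


def geometric_average(values, weights=None):
    """Root of the product of the values. With weights, each value is raised
    to its weight and the root taken is that of the sum of the weights
    (an assumed convention, reducing to the plain case for unit weights)."""
    if weights is None:
        weights = [1] * len(values)
    product = prod(v**w for v, w in zip(values, weights))
    return product ** (1 / sum(weights))


# Hypothetical observations, not taken from the text.
print(geometric_average([2, 4, 8]))        # cube root of 2 x 4 x 8 = 64
print(geometric_average([2, 8], [3, 1]))   # fourth root of 2^3 x 8 = 64
```

The results are computed in floating point and may therefore differ from the exact roots in the last decimal places.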

The median of a set of observations is the middle observa- 
tion when all the observations are ranked or arranged in order 
of magnitude. The median has the advantage, as an average, 
in that it requires no computation. It is easily found by 
inspection that the median of the set of observations given at 
the beginning of this section is 4.40. It will be noticed that, 
strictly speaking, there is no middle term in this case since 
there is an even number of observations; but since the value 
on each side of the middle is 4.40 we naturally select that value 
for the median. The median divides the number of observa- 
tions into two equal or nearly equal parts. Similarly, quartiles 
divide the number of observations into four parts, percentiles 
into one hundred parts, etc. 

Occasionally the harmonic mean is used, which is defined as 
the reciprocal of the arithmetic average of the reciprocals of 
the observations. 
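
Both of these averages are readily computed. In the sketch below the observations are hypothetical, the function name harmonic_mean is our own, and the median is taken from the standard statistics module of Python, which, when the number of observations is even, returns the arithmetic average of the two middle values.

```python
from fractions import Fraction
from statistics import median


def harmonic_mean(values):
    """Reciprocal of the arithmetic average of the reciprocals."""
    reciprocals = [1 / Fraction(v) for v in values]
    return 1 / (sum(reciprocals) / len(reciprocals))


# Hypothetical observations, not taken from the text.
sample = [3, 6, 6, 12]
print(median(sample))          # 6.0, the mean of the two middle values
print(harmonic_mean(sample))   # 16/3
```

For this sample the two middle values are equal, so that the median is the same value that would be selected by inspection.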

Certain other forms of averages will be mentioned in later 
sections. Since we shall give most of our attention to the
arithmetic average and certain other forms which are closely
related to it, it should perhaps be stated here, once for all, 
that many cases arise in practice where other averages would 
be greatly preferred for one reason or another. We shall 
confine most of our attention to certain averages, such as the 
arithmetic average, mainly because of their possible mathe- 
matical content. 


EXERCISES 


1. According to Vilmorin’s tables, the following seeds live approx- 
imately the designated numbers of years: 


Beaty years s wees 3 Her Plantae. s 6 Peas 2. eavanne 3 
Beet aecrre este 6 Bindive puree 10 Peanuts. 1 
Broccoliess- 5 Kohinabiac see 5 ‘Pepper. occ ee 4 
Cabhages-n asec 5 Leeksy ae ae. 58 3 Purp kin gee ae + 
@airotaaemeiee 4 Lemtilsae eee ct 4 Redishsee eee 5 
Cauliflower. .... in Letttice. 5. ...5- 5 Rapes... suse 
Celery. 8 Muskmelon..... 5 palsifyen wees eee 2 
Ghicoryersrsenie 8 Nasturtium..... 5 eye elkee ANA Go 5 
Gorn smectic 2 Osler aye ace 5 Squash... eer 6 
Corn Salad..... 5 Oni gna e eee 2 Tomatoes: . see 4 
Cress een seo PREM yet eae 2 BB bieary ee Re a. 5 
Cucumber...... 10 Parsleyaasasecee 3 Watermelon. ... 6 


Find the average length of life of the seeds. 

2. Ernest Thompson Seton gives, in “ The Arctic Prairies,” the 
number of antelopes in 26 bands seen along the C. P. Railroad in 
Alberta, within a stretch of 70 miles, as follows: 8, 4, 7, 18, 3, 9, 14, 1,
6, 12, 2, 8, 10, 1, 3, 4, 6, 18, 4, 25, 4, 34, 6, 5, 16, 4. Find the average
number in a band.

3. The distribution of ages of pupils in a certain public school was
as follows: 


Ages Frequencies 
12 7 
13 45 
14 186 
15 114 
16 61 
17 8 


Find the average age.

4. A certain college gives a credit of 4, 3, 2, 1 points for each A,
B, C, D respectively, received in a course. The average credits 
received by seniors above 3.1 in one semester were as follows: 


Ores Lh rhea eee tees 0) CORO mont Io 10 mo. 50 Sed Ona mess e 
Frequencies.......... eel 8 8 ie i il ig 


What was the average credit received by this group? 

5. The attendance at a certain university increased 10, 15 and 18 
per cent in three successive years, respectively. What was the average 
(geometric) annual rate of increase? 

6. If the protein content of corn of a community increased 5, 8, 12, 
18 and 23 per cent in five successive years of breeding, what was the 
average (geometric) annual rate of increase? 

7. Find the values of the medians in Exercises 1-4. 


38. Frequency Distributions.—Attention was called in the 
preceding section to the possibility of two or more of a set of 
observations having the same value. It should be obvious 
that this possibility is due fundamentally to inaccuracy in 
expressing the observations and that if the measurements were 
sufficiently refined the expressed values of no two of them need 
be equal. Even then, however, a close examination would 
reveal the fact that the observations are by no means equally 
spaced but tend to concentrate at certain very important 
points. These points will be noted much more quickly and 
reliably if we sacrifice accuracy in expression to the extent of 
allowing the observations to fall into classes of equal intervals. 
The number of observations falling into any class will be called 
the frequency; the middle value of the possible measurements 
or observations of a class is called the class mark; and the com- 
plete series of pairs of class marks and corresponding frequencies, 
arranged in order of size of the class marks, is called a frequency 
distribution. The limiting measurements of a class are called 
the class limits. As an example, the 36 observations given in 
the preceding section may be classified to give any one of 
several possible frequency distributions; two of these possibilities 
are as follows: 

            No. 1                              No. 2 
Observations     Frequencies      Observations     Frequencies 
(class marks)                     (class marks) 

    2.8               1               3.0               1 
    3.1               0               3.5               5 
    3.4               1               4.0               9 
    3.7               6               4.5              11 
    4.0               4               5.0               7 
    4.3               9               5.5               3 
    4.6               5                                -- 
    4.9               6                                36 
    5.2               2 
    5.5               1 
    5.8               1 
                     -- 
                     36 


The proper interpretation of the various items is very 
important. Thus, the frequency “9” in the second distri- 
bution refers to 9 observations not all of size 4.0 but of sizes 
ranging between the two limits 3.75 and 4.25. The frequency 
“9” in the first distribution refers to observations of sizes 
ranging from 4.15 to 4.45. The class interval of the second 
distribution is then 0.5 and that of the first is 0.3. 
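
The relation between a class mark, the class interval, and the class limits can be sketched in a few lines of Python. The function name `class_limits` is an illustrative assumption and does not come from the text; the two calls reproduce the limits quoted above for the frequency "9" in each distribution.

```python
def class_limits(class_mark, interval):
    """Return the (lower, upper) limits of a class centred on its class mark."""
    half = interval / 2
    return class_mark - half, class_mark + half


# Distribution No. 2: class mark 4.0, class interval 0.5
low2, high2 = class_limits(4.0, 0.5)
# Distribution No. 1: class mark 4.3, class interval 0.3
low1, high1 = class_limits(4.3, 0.3)

print(round(low2, 2), round(high2, 2))  # 3.75 4.25
print(round(low1, 2), round(high1, 2))  # 4.15 4.45
```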

It is easily verified that the values of the arithmetic average 
of these distributions differ very little whether we use the iso- 
lated values as given originally or whether we use one of the 
frequency distributions given above. This fact holds fairly 
consistently for all distributions. We shall see later, moreover, 
that the labor of computing the value of the arithmetic average 
and of other expressions is greatly reduced by the use of fre- 
quency distributions. The formation and use of frequency 
distributions is further justified by the fact that in the ideal 
investigation, where a very large number of observations are 
made, the values of these observations fall naturally into classes 
from the very beginning. 

It is only natural to ask how far one is justified in carrying 
the process of compressing a set of observations into a frequency 
distribution. It should be emphasized that there may be 
important reasons for retaining such observations in their 
original form in some cases, but otherwise the criterion usually 
applied is that the compression of the observations may be 
carried until a relatively smooth series of frequencies is obtained. 
Such a criterion affords no rigid method of application, and we 
simply claim that the process of forming frequency distributions 
is natural and—as we shall see—leads to a much simpler method 
of computing the mean and other expressions with only a slight 
loss in accuracy. 

39. Frequency Curves: Fitting Curves to Frequencies: 
The Mode.—If we plot the corresponding pairs of values of a 
frequency distribution, letting x refer to the observation or 
class mark and y to the corresponding frequency, and join the 
points thus obtained by a smooth curve, we obtain what is 
called a frequency curve. In many cases where the number of 
observations is small and where the discrepancies in the smooth- 
ness of the values of the frequencies are clearly due to this 
lack in number, it may be preferable to pass a smooth curve 
through as many points and as near the others as possible, in 
the hope of obtaining a representative curve which we shall 
refer to also as a frequency curve. The process of determining 
this representative curve, or of fitting a curve to the observed 
frequencies, is useful in smoothing or graduating these fre- 
quencies. Thus, if this representative curve were plotted 
carefully on ruled paper, the ordinates corresponding to the 
original observations could be easily read off to give a new and 
smoother distribution of frequencies, which is called a gradua- 
tion of the original frequencies. 

Other, and usually more satisfactory, methods of graduation 
will be considered later, which will consist essentially of deter- 
mining the analytic expression for the frequency curve which 
fits the observed frequencies. Such a procedure will not only 
usually lead to more satisfactory graduations but, what is more 
important, will also render available the many processes of 
mathematical analysis. The fitting of curves to observed 
frequencies by the formal processes of analysis is one of the 
most useful problems of scientific work, because it usually 
replaces a large number of relatively intangible statistical data 
by a single representative expression of algebraic form which 
may be analyzed at great length. 

If a number of observations were arranged in the form of a 
frequency distribution, a mode is a value to which corresponds 
a greater frequency than to values just preceding or values 
immediately following it. A frequency distribution may then 
have more than one mode, although we shall confine our atten- 
tion almost wholly to frequency distributions which have only 
one mode. 
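
As a rough illustration of this definition, the following Python sketch lists the class marks whose frequency exceeds that of both neighbouring classes, using the two frequency distributions of Art. 38. The function name `interior_modes` is hypothetical, and the first and last classes are deliberately skipped (an assumption made here, since they have only one neighbour).

```python
def interior_modes(class_marks, frequencies):
    """Class marks whose frequency exceeds that of both neighbouring classes."""
    modes = []
    for i in range(1, len(frequencies) - 1):
        if frequencies[i] > frequencies[i - 1] and frequencies[i] > frequencies[i + 1]:
            modes.append(class_marks[i])
    return modes


# Distribution No. 1 (class interval 0.3)
marks_1 = [2.8, 3.1, 3.4, 3.7, 4.0, 4.3, 4.6, 4.9, 5.2, 5.5, 5.8]
freqs_1 = [1, 0, 1, 6, 4, 9, 5, 6, 2, 1, 1]
# Distribution No. 2 (class interval 0.5)
marks_2 = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
freqs_2 = [1, 5, 9, 11, 7, 3]

print(interior_modes(marks_1, freqs_1))  # [3.7, 4.3, 4.9]
print(interior_modes(marks_2, freqs_2))  # [4.5]
```

The finer classification shows three modes and the coarser one only a single mode, which suggests how compressing the observations smooths the series of frequencies.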

The great service of the mode is to characterize a type. 
Thus, when we say that a certain man is an average citizen we 
mean that he represents a type which is met oftener than any 
other; we certainly do not refer to an arithmetic average or a 
median. Thus, if a few citizens of a community are million- 
aires, and all the rest, to quite a number, are in poverty, we 
should say that the average citizen is in poverty, meaning by 
average citizen the “modal” citizen; an arithmetic average 
of the wealth of the community might give the erroneous 
impression that the people of the community are in good 
financial condition. The mode as an average suffers from the 
fact that it can not usually be determined with any great 
degree of accuracy—at least, from the average frequency dis- 
tribution. 

40. Rectangular Histograms: Histograms of Three Dimen- 
sions.—If, in plotting the points corresponding to the observa- 
tions and frequencies of a distribution, the ordinates were 
replaced by vertical rectangles of proportionate length and of 
width proportional to the class interval, where the midpoint 
of the base of a rectangle is taken at the corresponding class 
mark as an abscissa, the aggregate of rectangles—called fre- 
quency rectangles—is called a rectangular histogram. The 
histogram corresponding to frequency distribution No. 2, 
Art. 38, would appear somewhat as follows: 


[Fig. 1. Rectangular histogram of frequency distribution No. 2; horizontal axis marked 3.0, 3.5, 4.0, 4.5, 5.0, 5.5.] 


If the ordinates of the graph of an analytic expression cor- 
responding to successive integral values of the abscissas were 
drawn, and then the corresponding histogram were drawn, 
the area of the histogram would be given by ordinary summa- 
tion or finite integration. Thus, the geometrical interpretation 
of finding the sum of a finite number of terms of a series is the 
determination of the area of the corresponding histogram. 
The area under the corresponding curve would evidently be 
given by ordinary integration. The area of the histogram 
would, however, be only approximately equal to the area under 
the corresponding curve. At least a slight discrepancy between 
the two areas is then to be expected even when the analytic 
expression is known. As this analytic expression is rarely 
known in practice, another source of discrepancy would lie in 
the choice of the analytic expression selected to fit the observed 
frequencies; that is, the analytic expression selected could 
rarely be expected to fit a given frequency distribution exactly. 
The relation between the various characteristics of a histogram 
and the corresponding frequency curve determined analytically 
forms the essential basis for the explanation of the discrep- 
ancies between results obtained directly from statistical data 
and results obtained by the formal processes of mathematical 
analysis. The effort should be made simply to control the 
values of such discrepancies sufficiently to obtain the desired 
accuracy in the final results. 
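
The discrepancy between the two areas can be seen in a small numerical sketch. The curve y = x² and the range of abscissas 1 to 10 are assumptions chosen only for illustration; each rectangle is taken to have unit width with its base centred on an integral abscissa, so the curve is integrated from 0.5 to 10.5.

```python
def histogram_area(f, first, last):
    """Sum of unit-width rectangles whose heights are f at integral abscissas."""
    return sum(f(x) for x in range(first, last + 1))


def area_under_square_curve(first, last):
    """Exact integral of x**2 from first - 0.5 to last + 0.5."""
    a, b = first - 0.5, last + 0.5
    return (b**3 - a**3) / 3


hist = histogram_area(lambda x: x**2, 1, 10)   # ordinary summation
curve = area_under_square_curve(1, 10)         # ordinary integration

print(hist)             # 385
print(round(curve, 4))  # 385.8333
```

Here the two areas differ by less than one part in four hundred, although the analytic expression is known exactly.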

Frequency distributions considered heretofore are said, for 
obvious reasons, to be of two dimensions. Frequency distri- 
butions of three dimensions are also common and are usually 
given in the following form, which gives the results of an investi- 
gation into the relation between the grades by psychological 
test and mathematical grades of 307 students: 


Psychological Test 


Mathematical 
Grades | 95 | 105 | 115 | 125 | 135 | 145 | 155 | 165 | 175 | 185 | 195 | Total 
pol peel hae 1 1) 3) 3st 71-9) 4 ee 
ed a a} 1141 71101 10) 8} °8 ) oa 
Fie itekd | al ee gs |16123|35|28|}22}15] 5 152 
65 dal eoh | hou ed Wee gedaan 43 
55 Wk ly Sule Gilat eee Oui 40 

Total.... | 1 | 4 | 20 | 30 | 36 | 57 | 57 | 52 | 33 | 15 | 2 | 307 


It will be noticed that the totals at the right and at the 
bottom constitute frequency distributions of mathematical 
grades and grades by psychological test respectively. 

If parallelopipeds were thought of as erected upon the xy 
plane of lengths proportional to the frequencies and correspond- 
ing to the class marks given in such a table, the aggregate of 
parallelopipeds would form a histogram of three dimensions; 
and if a surface were thought of as fitting the frequencies or 
the histogram somewhat as a frequency curve fits ordinates 
representing frequencies or a rectangular histogram, the sur- 
face is called a frequency surface. 

41. Deviations: A Method of Computing the Arithmetic 
Average.—We propose now to illustrate how the formation of 
frequency distributions may be employed to simplify the com- 
putation of the arithmetic average. It is desirable first, how- 
ever, to introduce also the idea of deviations. 

Suppose that we select a trial, or provisional, mean of a 
given distribution by inspection and denote it by M’. Then 
the average d of the deviations of all the observations of the 
distribution, say $x_1, x_2, \ldots, x_n$, from this trial mean, is obviously 

$$d = \frac{(x_1 - M') + (x_2 - M') + \cdots + (x_n - M')}{n} = M - M',$$

or 

$$M = M' + d, \tag{36}$$


where M denotes the true mean. Formula (36) then shows 
that if we work with the deviations from a trial mean instead 
of with the original observations the average so obtained, or d, 
constitutes a correction which when added algebraically to the 
trial mean will give us the true mean. A few examples will 
verify the fact that such a plan simplifies the computation 
because large factors will thereby be replaced by smaller 
factors. It is easily verified that the average of the deviations 
from the true mean must be zero. 

Computation of the mean will be further simplified if we 
replace the unit of measurement at first by unity and restore it 
later, in which case formula (36) may evidently be written 


$$M = M' + k d', \tag{37}$$


where k is the unit of measurement and d’ is the average of the 
deviations from the trial mean when this unit of measurement 
is taken to be unity. The application of formula (37) is shown 
below. 


Original         Frequencies     Deviations 
Observations          y               x             xy 

    3.0               1              -3             -3 
    3.5               5              -2            -10 
    4.0               9              -1             -9 
    4.5              11               0              0 
    5.0               7               1              7 
    5.5               3               2              6 
                     --                             -- 
                     36                             -9 


The trial mean is M’ = 4.5; 
the unit of measurement is k = 0.5; 
and the unit correction is d’ = -9/36 = -0.25. 
Hence, the true mean is M = 4.5 + 0.5(-0.25) = 4.375 or 4.38. 
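
The same computation may be sketched in Python. The function name `mean_by_trial_mean` and its parameter names are illustrative assumptions; the sketch also assumes that every class mark lies a whole number of units from the trial mean, so that the deviations can be rounded to integers.

```python
def mean_by_trial_mean(class_marks, frequencies, trial_mean, unit):
    """Mean of a frequency distribution via deviations from a trial mean."""
    n = sum(frequencies)
    # Deviations from the trial mean, with the unit of measurement taken as unity.
    coded = [round((x - trial_mean) / unit) for x in class_marks]
    # Unit correction d': the average of the coded deviations.
    d_prime = sum(c * f for c, f in zip(coded, frequencies)) / n
    # Restore the unit of measurement: M = M' + k d'.
    return trial_mean + unit * d_prime


marks = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
freqs = [1, 5, 9, 11, 7, 3]

m = mean_by_trial_mean(marks, freqs, trial_mean=4.5, unit=0.5)
# Direct computation from the class marks, for comparison.
direct = sum(x * f for x, f in zip(marks, freqs)) / sum(freqs)

print(m)       # 4.375
print(direct)  # 4.375
```

Any other class mark could serve as the trial mean; only the size of the correction changes, not the result.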


Many observations appear originally in the form of devia- 
tions, and such deviations constitute errors of observation so 
frequently that the term “errors” is used very widely as 
synonymous with the term “deviations.” The term “error” 
then implies far more in such a connection than the term usually 
implies. 


EXERCISES 


1 and 2. Form frequency distributions of the data of Exercises 
1 and 2 of the preceding list and compute the arithmetic averages. 


(a) Form a frequency distribution of the data of Exer- 
cise 2 with class marks 3, 8, 13, 18, etc., and compute the 
arithmetic average. 


3. A set of examination books in geometry were subjected to two 
independent readings by the College Examination Board to give the 
following differences in credit (on a basis of 100) and corresponding 


frequencies: 
Differences........ 0 1 2 3 4 5 6 yf 8 9 
Frequencies. ...... 5 5 v7 9 5 14 2 8 6 5 
Differencess.....5., LO= Ll B12 93> 14 15 eS D0) ee oom 
Frequencies....... a bi + 2 1 3 2 2 1 1 
(a) Compute the average difference. Ans. 1.01; 
(b) Construct a frequency distribution with class marks 
1, 4, 7, 10, etc., and compute the arithmetic average. Ans. 7.00. 

4. Another example of the same kind as the preceding yielded: 
Differences........ 0 ul 2 3 f 5 6 vi 8 9 
Frequencies....... 6 5 22 12 Salis 2 8 0 3 
Differences........ KO a ee aly IE aly aI eas 
Frequencies....... 3 1 0 2 0 3 0 1 1 


Follow the instructions given in the preceding exercise. 


Ans. 4.76. 
4.62. 

5. The average credit received by all students in one year at a 
certain college was as follows: 


Credit     Frequency     Credit     Frequency 
 4.05         12          1.65         290 
 3.65         34          1.25         174 
 3.25         96          0.85          81 
 2.85        128          0.45          33 
 2.45        244          0.05           9 
 2.05        250 


Compute the average credit. What single credit (i.e., the mode) 
is most likely? 

6 and 7. The following frequency distributions give, first, the 
lengths of 800 ears of corn in inches, and second, the heights of the 
freshmen at a certain university. 


Lengths     Frequencies     Heights     Frequencies 

  4.0            1             61             2 
  4.5       [illegible]        62            10 
  5.0            8             63            11 
  5.5           33             64            38 
  6.0           70             65            57 
  6.5          110             66            93 
  7.0          176             67           106 
  7.5          172             68           126 
  8.0          124             69           109 
  8.5           61             70            87 
  9.0           32             71            75 
  9.5           10             72            23 

                               74             4 


(a) Compute the average length (or height). 

(b) Form a frequency distribution with class marks 4.25, 
5.25, 6.25, etc. (61.5, 63.5, 65.5, etc.), and compute the average 
length (or height). 

(c) What length (or height) was most common? 

8. The weights of the freshmen of a certain university were found 
to be as follows: 


Pounds     Number     Pounds     Number 

104.5        21       154.5        87 
114.5        68       164.5        35 
124.5       169       174.5        14 
134.5       203       184.5        11 
144.5       142 


Compute the average weight. 

9. The following data are the partial results of an investigation 
into the cost of living in the District of Columbia for 1917, among 
representative families drawing salaries of $1800 or less (incomes 
given in the table which exceed $1800 refer to families which received 
remuneration in addition to salaries). 


Incomes     Average Size     Number of 
             of Family       Families 

 $300          2.2              10 
  500          3.1              54 
  700          3.5             156 
  900          3.8             247 
 1100          3.7             242 
 1300          4.0             280 
 1500          3.9             221 
 1700          4.0             122 
 1900          4.1              87 
 2100          4.2              35 

Assign a trial mean, ignoring the unit of measurement until the 
last, and 


(a) Compute the average income. 
(b) Compute the average size of a family. 


(c) What size of family and what income were most likely? 


10. The following data give the ages at which infection of leprosy 
was “supposed” to have taken place among cases investigated in 
Hawaii. 

Ages      Frequency     Ages      Frequency 
 1-5           8        46-50        48 
 6-10         56        51-55        41 
11-15        163        56-60        26 
16-20        204        61-65        18 
21-25        143        66-70        18 
26-30        114        71-75         3 
31-35         79        76-80         3 
36-40         89        81-85         1 
41-45         44 


Determine the average age of infection. 
At what single age was infection most likely? 


42. Normal Distributions.—It is needless to say that egre- 
gious blunders can upset any kind of an investigation and that 
we must take their absence for granted. Errors of smaller 
size are, however, inevitable and are of the greatest importance 
in the analysis of statistics. A vast majority of these errors 
are compensating in character and tend, in the long run, not 
only to hover or concentrate about the mean but also to occur 
with the same frequency corresponding to each size of devia- 
tion on one side of the mean as on the other. Such distribu- 
tions are called normal distributions and will be given much 
consideration in later sections. It will be desirable for the 
present to include as normal distributions many distributions 
which possess discrepancies which seem to be due to lack in the 
number of observations and which would promise to disappear 
if the number of observations were increased indefinitely. 
The graph of a truly normal distribution would then be sym- 
metrical with respect to an axis of reference erected at the mean. 

There will be found many frequency distributions which 
differ markedly from a normal distribution. However, the 
great majority of all the distributions will prove to be normal, 
according to the loose definition given above, when we have 
cogent reason for believing that deviations of any size on one 
side of the mean are as likely as deviations of the same size on 
the other side. 

We propose to assume that distributions of errors of observa- 
tion would prove to be normal whenever we have cogent reason 
for the assumption, and we usually have cogent reason for the 
assumption when the observations refer to the size of a single 
object, such as the deviations from the most probable value of 
a large number of readings of a barometer made under the 
same conditions. It should, of course, be kept in mind in that 
connection that the actual distributions of errors of observa- 
tion will usually be entirely lacking and that the assumption 
stated above is the best at our disposal. 

Frequency distributions of measurements of various objects, 
even though these objects belong to the same particular class, 
may or may not be normal. Thus, the distribution of the 
heights of a large group of individuals all of the same age might 
be fairly normal, but the distribution of the individuals of a 
community with respect to wealth would be pretty sure to 
differ widely from a normal distribution. In neither of the last 
two illustrations would we have cogent reason for assuming 
the distributions to be normal. 

Even distributions of measurements made upon a single 
object differ occasionally from normal distributions for some 
particular reason which may not be suspected at first thought. 
Thus, if we set up a frequency distribution of the results of 
estimating the center of a book with the point of a knife blade, 
our knowledge of the inequality of the strength of the two eyes 
would probably affect any cogent reason for expecting a normal 
distribution. 

43. The Standard Deviation or Dispersion.—One variation 
from the method of computing an average, such as the arith- 
metic average or mean of a set of observations, would be to 
square each deviation from the mean, determine the average 
of the squares, and then extract the square root of this average. 
The final result is called by the expressive term root mean 
square, or more generally, the standard deviation or dispersion 
of the observations. As an example, the squares of the devia- 
tions of the observations given at the beginning of this chapter 
from their mean 4.38 would be 


2.62 0.38 0.05 0.00 0.12 0.38 
1.12 0.34 0.05 0.00 0.12 0.38 
0.49 0.21 0.05 0.00 0.14 0.49 
0.44 0.21 0.01 0.05 0.25 0.81 
0.44 0.09 0.00 0.07 0.34 1.04 
0.44 0.07 0.00 0.09 0.38 1.80 


It is easily verified that the average of these squares is 0.374 
and that the value of the standard deviation is then 0.61. 

Special attention is called to the fact that the largest devia- 
tions are given special emphasis in computing the value of the 
standard deviation, since they are squared before an average is 
taken, and numbers greater than unity are increased and 
numbers less than unity are diminished by the process of 
squaring. This is usually very desirable because the presence 
of large deviations is usually very important. The value of the 
standard deviation constitutes a good measure of the con- 
sistency of a set of observations or the extent to which the 
observations differ from the mean. Hence, we are usually 
very much interested in any inconsistencies which are apt to 
affect our confidence in the value of the mean. The method of 
computing the standard deviation obviates, then, any possi- 
bility of overlooking any such inconsistencies. As an example, 
suppose that two groups of individuals were to make readings 
of a finely graduated barometer. If the standard deviations 
of the two sets of observations should differ significantly, we 
would naturally place more reliance in the mean of the set 
whose standard deviation had the less value, because that set 
of observations would be regarded as more consistent. 
The standard deviation or dispersion will prove to be of 
great fundamental importance in almost all of the work of the 
succeeding chapters. It is well, then, that we look for a much 
easier method of computing it. Such a method, like that of 
computing the mean, is based upon the use of a frequency dis- 
tribution of deviations from a trial mean. Here, again, we shall 
find that the compression of a set of observations into a fre- 
quency distribution, with a corresponding sacrifice in the 
accuracy of the expression of the individual observations, 
affects much less the accuracy of an average such as the stand- 
ard deviation. 

If we denote the square of the dispersion of a set of observa- 
tions $x_1, x_2, \ldots, x_n$ about a trial mean M’ by $s^2$, then 

$$s^2 = \frac{\sum (x_i - M')^2}{n} = \frac{\sum (x_i - M + M - M')^2}{n} = \frac{\sum (x_i - M + d)^2}{n}$$

$$= \frac{\sum (x_i - M)^2}{n} + 2d\,\frac{\sum (x_i - M)}{n} + d^2,$$

where d denotes the correction to be applied to the trial mean 
M’ to give the true mean M. 

But $\frac{\sum (x_i - M)}{n}$ and $\frac{\sum (x_i - M)^2}{n}$ are, respectively, the average 
of the deviations from the mean and the square of the dispersion 
which we desire, of which the value of the first is zero. If, 
then, we denote the dispersion about the mean by σ we have 

$$s^2 = \sigma^2 + d^2,$$

or 

$$\sigma^2 = s^2 - d^2. \tag{38}$$

It is left for the student to verify from the following example 
that if the unit of measurement is k and the standard deviation 
with this unit of measurement replaced by unity is σ’, then 


$$\sigma = k\sigma'. \tag{39}$$


The application of formulas (38) and (39) is shown below 
in connection with the very much compressed frequency dis- 
tribution given previously: 

Observations      Frequencies     Deviations 
(class marks)          y               x            xy          x²y 

    3.0                1              -3            -3            9 
    3.5                5              -2           -10           20 
    4.0                9              -1            -9            9 
    4.5               11               0             0            0 
    5.0                7               1             7            7 
    5.5                3               2             6           12 
                      --                            --           -- 
                      36                            -9           57 

Hence 

$$d' = -\frac{9}{36} = -0.25,$$

$$s'^2 = \frac{57}{36} = 1.58,$$

$$\sigma'^2 = 1.58 - 0.06 = 1.52,$$

and 

$$\sigma = k\sigma' = 0.5\sqrt{1.52} = 0.62.$$

It will be recalled that the value of the standard deviation 
obtained directly from the original observations was 0.61. 
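
As a check on the arithmetic, formulas (38) and (39) can be applied to the compressed distribution in a short Python sketch. The function name `dispersion_by_trial_mean` is a hypothetical label; as in the computation of the mean, the sketch assumes that the class marks lie a whole number of units from the trial mean.

```python
import math


def dispersion_by_trial_mean(class_marks, frequencies, trial_mean, unit):
    """Standard deviation of a frequency distribution via a trial mean."""
    n = sum(frequencies)
    # Deviations from the trial mean, with the unit of measurement taken as unity.
    coded = [round((x - trial_mean) / unit) for x in class_marks]
    d_prime = sum(c * f for c, f in zip(coded, frequencies)) / n         # unit correction
    s_prime_sq = sum(c * c * f for c, f in zip(coded, frequencies)) / n  # about the trial mean
    sigma_prime_sq = s_prime_sq - d_prime**2                             # formula (38)
    return unit * math.sqrt(sigma_prime_sq)                              # formula (39)


marks = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
freqs = [1, 5, 9, 11, 7, 3]

sigma = dispersion_by_trial_mean(marks, freqs, trial_mean=4.5, unit=0.5)
print(round(sigma, 2))  # 0.62
```

The result differs only slightly from the value 0.61 obtained from the original observations, in spite of the heavy compression.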

44. Average Deviation.—We have already referred to the 
fact that it is characteristic of the standard deviation to give 
special emphasis to the larger deviations. Cases arise occasion- 
ally where this emphasis is unnecessary or even undesirable. 
Under such circumstances the average deviation usually proves 
useful and may be defined as the arithmetic average of the 
absolute values of the deviations from the mean. It is easily 
verified that the average deviation of the observations given at 
the beginning of this chapter is 0.49, and that the average devia- 
tion of the deviations of the frequency distribution considered 
in the preceding section is also 0.49. It should be noted that 
this value is less than that found for the standard deviation, 
for the reason stated above. 


EXERCISES 
1 and 2. Compute the values of the standard deviation of the dis- 
tributions given in Exercises 3 and 4 of the preceding list. The 
second distribution of that list was obtained after a somewhat more 
thorough examination than in the case of the first distribution. Do 
the values of the standard deviation reflect this fact? Show that the 
value of the ratio of the standard deviation and the mean is greater 
for the second distribution than for the first. Compute the average 
deviations. Ans. 4.25 and 3.05. 

3. Scores in whist, by evenings, were made by two players as 
follows: 


A: —51, 19,8, —23, 15, 38, 11, —1, 3, 35, —84, 34, —20, 42, 41, —33 
Bi. 5, +378 —20. 66, —84 16) 4, 15, 8s 
1747 


These scores are expressed in sevenths to avoid fractions. Take 
this fact into consideration and compute the actual average score and 
dispersion of each player. Which player has the better average and 
which appears to be more consistent? 

4. The scores of a famous cricketer for two years were as follows: 


1905: 27, 76, 14, 13, 47, 45, 39, 7, 15, 34, 38, 106, 107, 75, 8, 3, 4, 4, 4, 
47, 13, 209, 66, 14, 0, 78, 10, 23, 0, 0, 44, 16 

1906: 0, 10, 204, 0, 0, 60, 0, 85, 0, 49, 86, 3, 50, 5, 10, 7, 70, 43, 9, 59, 
97, 0, 166, 152, 13, 22, 34, 82, 14, 14, 35, 56, 48, 23, 31 


Compute the average and the dispersion for each year. 
Ans. 37.1, 43.1; 43.9, 49.3. 
Note that the value of the ratio of the dispersion and the mean 
was greater for 1905 than for 1906. 
5. Follow the instructions given in the preceding exercise but for 
the scores: 


1906: 9, 37, 24, 24, 58, 10, 22, 255, 13, 73, 1, 5, 47, 3, 110, 80, 0, 16, 
8, 21, 20, 3, 31, 70, 55, 26, 60, 68, 32, 143, 80, 5, 19, 1, 115, 82, 
70, 52, 6, 85, 6, 33, 31, 6, 59, 2, 66, 71, 63, 44, 8, 40, 122, 29, 
38, 76, 32, 17 

1907: 39, 82, 25, 5, 219, 135, 0, 46, 0, 68, 10, 31, 194, 244, 143, 125, 
54, 69, 6, 15, 100, 144, 34, 144, 208, 54, 34, 76, 10, 16, 10, 109, 
51, 18, 14, 76, 168, 17, 3, 94, 45, 51, 8, 9, 24, 60, 110, 141, 22, 
28, 62, 51, 17. Ans. 44.5, 43.9; 66.4, 62.0. 

45. The Coefficient of Variation.—In comparing the way 
two things vary it should be evident that relative size influences 
not only the mean but also the deviations from it. Stating 
the matter a little differently, if the mean is greater in one 
investigation than in another it is only natural to expect the 
deviations from the mean to be greater. A better measure of 
the variability of a character for purposes of comparison is 
usually given, therefore, by dividing the value of the standard 
deviation by the value of the mean. The coefficient of variation 
v is defined by the relation 


$$v = \frac{100\sigma}{M},$$


where M denotes the mean. 

The formula for the coefficient of variation does not fit 
very easily into the many formulas which will be considered 
in the following chapters, and it is rarely necessary in extended 
investigations, such as will be treated later, to subject the 
variability of characters to an analysis beyond that afforded by 
the value of the standard deviation alone. However, if the 
main object of an investigation is to measure the variability 
of a character, and accuracy is essential, then the value of the 
coefficient of variation should also be computed. 

As an example, let us compare the records of the famous 
English cricketer, Hayward, for the years 1905 and 1907, and 
determine in which year he was more consistent. The scores 
for these years are as follows: 


1905: 43, 27, 13, 9, 10, 34, 59, 53, 13, 116, 11, 4, 3, 31, 22, 35, 3, 24, 
21, 128, 168, 52, 122, 17, 148, 91, 88, 14, 203, 13, 64, 177, 88, 
76, 106, 112, 2, 25, 48, 64, 26, 24, 216, 81, 58, 14, 33, 32, 35, 53, 
10, 9, 197, 12, 28, 28, 0, 44, 2. 

1907: 39, 82, 25, 5, 219, 135, 0, 46, 0, 68, 10, 31, 194, 244, 143, 125, 
54, 69, 6, 15, 100, 144, 34, 144, 208, 54, 34, 76, 10, 16, 10, 109, 
51, 18, 14, 76, 168, 17, 3, 94, 45, 51, 8, 9, 24, 60, 110, 141, 22, 
28, 62, 51, 17. 

The following results are obtained: 


Average Standard Coefficient of 
Score Deviation Variation 
1905 54.9 55.0 100 
1907 66.4 62.0 93 


It follows, then, that although the value of the standard 
deviation of the scores was greater for 1907 than for 1905, we 
should not be justified in concluding that Hayward was 
more consistent in 1905, for his average was so much greater 
in 1907 that the value of the coefficient of variation was less 
for that year than for 1905. 
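
The comparison can be reproduced from the tabulated averages and standard deviations with a short Python sketch. The function name `coefficient_of_variation` is an illustrative assumption; the sketch starts from the summary values in the table and not from the individual scores.

```python
def coefficient_of_variation(mean, standard_deviation):
    """Standard deviation expressed as a percentage of the mean."""
    return 100 * standard_deviation / mean


v_1905 = coefficient_of_variation(54.9, 55.0)
v_1907 = coefficient_of_variation(66.4, 62.0)

print(round(v_1905), round(v_1907))  # 100 93
```

The smaller value for 1907 is what supports the conclusion that the scores of that year were relatively the more consistent.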


EXERCISES 


1. The scores in whist made by seven players, by evenings, are as 
follows: 


A B C D E F G 
13 12 18 —37 — 9 == *!) 12 
—35 65 —11 8 —17 —27 65 
10 16 = KZ 18 6 16 
= — 8 23 —1i1 ll i aC 
48 48 37 —19 —1 —19 17 
7 7 3 8 3 —28 —20 
41 8 Le 7 —16 7 ily 
5 —18 0 —14 5 21 5 
4 +4 —32 —29 37 —34 —32 
34 8 8 1 —16 11 34 
22 —15 —15 2¢ —14 23 22 
13 15 15 —4 0 —33 13 
29 —28 16 —23 1 —12 11 
52 21 52 = 9 —11 — 2 35 
—14 11 —25 41 11 —13 —14 
ea —36 14 — 5 —17 — § 15 


These scores are expressed in fifths to avoid fractions. Take this 
fact into consideration and compute the actual average score and dis- 
persion of any two players. Note not only which player has the 
better average, but also whether the ranking of the players in regard 
to consistency is affected by the size of the average score. 

2. The exports of the United States and of Great Britain, in 
millions of dollars, for the first eleven months of 1921 were as follows: 


                   United States     Great Britain 

January...........      279               302 
February..........      251               298 
March.............      330               327 
April.............      318               285 
May...............      308               298 
June..............      335               271 
July..............      301               305 
August............      302               301 
September.........      313               305 
October...........      371               305 
November..........      383               339 


Compute and compare the values of the dispersions and coeffi- 
cients of variation. 

3. Two hundred estimates of the center of a book (a different book 
was used in each case and the ends of the book were reversed after 
each estimate) were made by each of four individuals to give the 
following distributions: 


      A’s                 B’s                 C’s                 D’s 
Page    Fre-        Page    Fre-        Page    Fre-        Page    Fre- 
        quencies            quencies            quencies            quencies 

480       4         545       1         540       2         615       1 
485      21         550       1         550      10         625       1 
490       8         555       4         560      10         635      10 
495      15         560       3         570      24         645      25 
500      19         565       7         580      23         655      39 
505      33         570      18         590      28         665      46 
510      23         575      21         600      34         675      32 
515      17         580      17         610      25         685      27 
520      12         585      28         620      21         695      10 
525      19         590      22         630      13         705       7 
530      13         595      31         640       9         715       2 
535      11         600      14         650       1 
540       1         605      13 
545       2         610      11 
550       1         615       3 
555       1         620       4 
                    625       1 
                    630       1 

Compute the coefficients of variation and rank the estimators in 
the order of their consistency. v 
Ans. D, 0.27. 
C, 0.40. 
B, 0.51. 
A, 0.62. 


4. The scores of two cricketers for three seasons were as follows: 


A: 0, 0, 4, 20, 61, 26, 8, 
206, 48, 43, 0, 1, 1, 


10, a i 52, 5, 27, 40, 12, 2, 11, 34 
3. 7, 27, 77, 17, 43, 42, 24, 20, 23, 10, 68, 
170, 58, 47, 77; 28, 1, 6 0, 38, 4, 48, 0, 0, 1, 25, 1, 17, 20, 11, 
14, 89, 234, 5, 23, 0, 1, 52, 4, 2 A 4, 19, 15, 54, 21, 65, 57, 60, 
8, 10, 15, 11, 1, 45, 0,7: 18, 25, 0, 43, 0, 1; 63, 55, 5, 17, 16, 
42, 8. 8, 87, 27, 22, as 0, 66, 40, 0, 15, 6, 12, 111, 
0, 4, 19, 53, 4, 6, 0, 48, 15, 75, 26, 28, 1, 10, 34, 2, 14, 9, 54. 
B: 27 ie ea a 45, 39, 7, 15, 34, 38, 106, 107, 75, 8, 3, 4, 
4, a 13, 209, 66, 14, 0, 78, 10, 23, 0, 0, 44, 16; 0, 10, 204, 
, 60, 0, 85, 0, 49, 86, 3. = 5, 1 , 7, 70, 43, 9, 59, 97, 0 
fe ; 152, 13, 22, 34, 82, 14, 14, 35, 56, 48, 23, 31; 61, 42, 137, 
10, 41, 72, 0, 22, 53, 66, 4, 73, 28, 17, 16, 122, 48 5, 87, 26, 
9, 66, 7, 9, 17, 44, 60, 12, 66, 2, 77 

Compute the values of the coefficients of variation and show that 
although B’s standard deviation is larger his coefficient of variation is 
smaller. 
Ans. A’s v=124, 
B’s v=105. 


46. The Graphical Interpretation of the Mean: the Simple 
Moment.—Although the use of the arithmetic average or mean 
is familiar to all, its graphical interpretation is much more 
subtle than the graphical interpretations of the mode or 
median. The graphical interpretation of the arithmetic 
average of a distribution can be given most briefly as the 
abscissa of the centroid (or center of gravity) of the total 
area of the corresponding rectangular histogram, but if we 
look up a definition of “centroid” we find that “the centroid 
of a mass is the point such that the moment of the mass with 
respect to any plane is the same as if the whole mass were 
concentrated at that point.” This definition holds as well 
for plane areas, in which case we need merely to change the 
word “plane” in the definition to “line.” Moreover, the 
word “mass” may be construed to refer to area, ordinate, 
force, etc. We finally see that a satisfactory appreciation 
of the definition of a centroid requires a satisfactory 
appreciation of the term “moment.” We shall therefore conclude 
this chapter with a brief discussion of simple moments¹ and 
some applications. While it is unlikely that such a treatment 
will give a clear conception of the geometrical meaning of a 
moment (or a centroid) it should help to remove much of the 
subtlety of such a term by showing that the fundamental 
principle is intuitively known to everyone and is in everyday 
use throughout the world. The discussion should also give 
some appreciation of the fundamental value of the principle. 

Since the magnitude and direction of a force can be repre- 
sented by a straight line of corresponding length, we shall 
define the moment of a force about a given point as the product 
of the force and the perpendicular drawn from the given point 
upon the line of action of the force. 

[Figure: a force F acting upon a thin body of irregular outline; OP is the perpendicular from the point O to the line of action of F.] 

Thus, the moment of a force F about a given point O is 
F·OP, where OP is the perpendicular drawn from O to the 
line of action of F. If the irregular closed curve in the figure 
represents the outline of a body of very small thickness, the 
physical effect of the force F acting upon the body would be to 
cause it to turn about the point O as a center. Hence, the 
product F·OP would seem to be a fitting measure of the 
tendency of F to turn the body about O. The directions of 
forces acting about such a point would be distinguished by 
positive and negative signs. 

¹ Moments of higher order will be treated in the next chapter. 

Fundamental Principle. If two or more such forces act 
upon such a body and yet the body remains in equilibrium, 
it must follow that the algebraic sum of the moments of 
the forces about the assumed point is zero. 

A special form of this principle can be used to find the 
effect, in a specified direction, of a given force acting in another 
direction. The effect or force to be found is called the com- 
ponent of the given force in the specified direction and is given 
by the projection of the line representing the given force in 
magnitude and direction, upon the line of direction of the 
component, and is therefore equal to the given force multiplied 
by the cosine of the included angle. 

These principles, together with the definition of a centroid 
given above, will now be used to solve a few exercises. 


Ex. 1.—How much of a force is required at one end of a lever of 
length 10 feet to lift a weight of 100 pounds at the other end, if the 
fulcrum is located 2 feet from the weight end? Neglect the weight of 
the lever. 

[Figure: lever AB with fulcrum C; the force P acts at A and the weight W at B.] 

According to the fundamental principle given above 

$$P \cdot AC - W \cdot BC = 0$$
or 
$$8P - 100 \times 2 = 0$$
and 
P=25 pounds. 

Ex. 2. A rod 8 feet long, supported by two vertical strings 
attached to its ends, has weights of 4 and 8 pounds hung from the 
rod at distances of 2 and 5 feet from one end. If the weight of the 
rod is 2 pounds, what are the tensions of the strings? 

Assuming the weight of the rod to act at its center, the following 
figure illustrates the conditions of the problem. 

[Figure: rod AE hung from strings with tensions R at A and S at E; weights of 4, 2 and 8 pounds act at B, C and D.] 

Since the rod is in equilibrium, the algebraic sum of the moments 
about A must be zero. 

Therefore, 

$$4 \times 2 + 2 \times 4 + 8 \times 5 - S \times 8 = 0$$
or 
S=7. 

Equating, similarly, the algebraic sum of the moments about E 
to zero, 
$$4 \times 6 + 2 \times 4 + 8 \times 3 - R \times 8 = 0$$
we find that R also is 7. 
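
The two moment equations of Ex. 2 can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function name `string_tensions` and the representation of each load as a (weight, distance) pair are assumptions made for illustration.

```python
def string_tensions(length, loads):
    """Return (R, S): tensions of the strings at the left and right ends.

    loads is a list of (weight, distance_from_left_end) pairs; the rod's
    own weight is entered as a load acting at its centre.
    """
    # Moments about the left end: S * length balances the loads.
    s = sum(w * d for w, d in loads) / length
    # Moments about the right end: R * length balances the loads.
    r = sum(w * (length - d) for w, d in loads) / length
    return r, s


# Ex. 2: weights 4 and 8 at 2 and 5 feet; rod weight 2 at the centre (4 feet).
R, S = string_tensions(8, [(4, 2), (2, 4), (8, 5)])
print(R, S)  # 7.0 7.0
```

As a further check, R+S equals the total load of 14 pounds, as it must for vertical equilibrium.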

Ex. 3.—Forces equal to 3P, 7P and 5P act along the sides of an 
equilateral triangle as indicated in the following figure. Find the 
magnitude, direction and line of action of the resultant. 

[Figure: equilateral triangle ABC with the forces 3P, 7P and 5P acting along its sides.] 

Let the side of the triangle be s and let us assume that the line of 
action of the resultant will cut the line BC somewhere, say, Q, and the 
line AB somewhere, say, R. Hence, the algebraic sum of the 
moments about Q must vanish, or 

$$3P(QC+s)\sin 60^\circ - 5P \times QC \sin 60^\circ = 0$$
or 
$$QB = QC + s = \frac{5s}{2}.$$
Similarly, from the algebraic sum of the moments about R, we 
obtain 

$$RB = \frac{5s}{2}.$$

Hence, the triangle RBQ is isosceles and the angle at R is 30°. 
Therefore, the line of action of the resultant passes through the points 
R and Q, just located, and makes an angle of 30° with line RB. 

The algebraic sum of the components of the three forces in the 
direction perpendicular to BC is 

$$5P \sin 60^\circ - 3P \sin 60^\circ = P\sqrt{3}.$$

Similarly, the algebraic sum of the components of the three forces 
in the direction BC is found to be 3P. The resultant of two given 
forces is represented by the diagonal of the parallelogram whose sides 
represent the two forces; hence, the magnitude of the resultant is 
$P\sqrt{12}$. 
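
The results of Ex. 3 can be verified by resolving the forces in coordinates. The sketch below is illustrative only and rests on assumptions that the surviving text does not state outright: B is placed at the origin with C on the x-axis, the forces are taken as 3P along AB, 7P along BC and 5P along CA (the arrangement implied by the moment equations), and P and s are both set to 1. The helper `resultant` is a hypothetical name.

```python
import math


def resultant(forces):
    """forces: list of (magnitude, start_point, end_point); each force acts
    along the line from start_point toward end_point.
    Returns (Rx, Ry, moment_about_origin)."""
    rx = ry = moment = 0.0
    for mag, (x0, y0), (x1, y1) in forces:
        length = math.hypot(x1 - x0, y1 - y0)
        fx = mag * (x1 - x0) / length
        fy = mag * (y1 - y0) / length
        rx += fx
        ry += fy
        moment += x0 * fy - y0 * fx  # moment of this force about the origin
    return rx, ry, moment


s = 1.0  # side of the triangle (assumed unit length)
A, B, C = (s / 2, s * math.sqrt(3) / 2), (0.0, 0.0), (s, 0.0)

# Assumed arrangement: 3P along AB, 7P along BC, 5P along CA, with P = 1.
rx, ry, m = resultant([(3, A, B), (7, B, C), (5, C, A)])

magnitude = math.hypot(rx, ry)            # in units of P
angle = math.degrees(math.atan2(ry, rx))  # inclination to BC, in degrees
qb = m / ry                               # distance QB where the line cuts BC
```

Under these assumptions `magnitude` agrees with P√12, `angle` with 30° and `qb` with 5s/2.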

Ex. 4.—Find the centroid of particles of masses of 2, 4 and 10 
units at the points (—10, 5), (6, 8) and (—2, —1), respectively. 

Let the coordinates of the centroid be $\bar{x}$ and $\bar{y}$. Then, according 
to the definition of a centroid, 

$$M\bar{x} = \sum mx$$
and 
$$M\bar{y} = \sum my,$$

where m denotes the mass of a particle and M the total of all the masses. 
Hence, 
$$16\bar{x} = (-10)2 + 4 \times 6 + 10(-2)$$
or 
$$\bar{x} = -1.$$
Similarly 
$$\bar{y} = 2.$$
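
The centroid of Ex. 4 follows directly from the two moment equations. A minimal Python sketch (the function name `centroid` and the (mass, x, y) tuple layout are illustrative assumptions):

```python
def centroid(particles):
    """particles: list of (mass, x, y). Returns (x_bar, y_bar)."""
    total = sum(m for m, _, _ in particles)
    # Each coordinate is the sum of mass times coordinate over the total mass.
    x_bar = sum(m * x for m, x, _ in particles) / total
    y_bar = sum(m * y for m, _, y in particles) / total
    return x_bar, y_bar


print(centroid([(2, -10, 5), (4, 6, 8), (10, -2, -1)]))  # (-1.0, 2.0)
```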
EXERCISES 


1. A rod, 5 feet long, supported by two vertical strings attached 
to its ends, has weights of 4, 6, 8 and 10 pounds hung from the rod at 
distances of 1, 2, 3 and 4 feet from one end. If the weight of the rod 
is 2 pounds, what are the tensions of the strings? 

Ans. R=13; S=17. 

2. A uniform rod, 4 feet in length and weighing 2 pounds, turns 
freely about a point distant 1 foot from one end, and from that end a 
weight of 10 pounds is suspended. What weight must be placed at 
the other end to produce equilibrium? 

3. A uniform beam is of length 12 feet and weight 50 pounds and 
from its ends are suspended masses of 6 and 12 pounds respectively. 
At what point must the beam be supported so that it may remain in 
equilibrium? 

4. Forces equal to P, 2P, 3P and 4P act along the sides of a square 
taken in order (i.e., AB, BC, etc.); find the magnitude, direction and 
line of action of the resultant. 

Ans. Magnitude = $2P\sqrt{2}$, parallel to AC and distant from it $\frac{5\sqrt{2}}{4}$ 
times the length of the side of the square. 

5. Show that the centroid of two particles divides the line joining 
them into segments inversely proportional to the masses. 

6. Show that the centroid of three equal particles lies at the inter- 
section of the medians of the triangle having the three points as 
vertices. 
7. Find the centroid of equal particles at the points (0, 0), (4, 2), 
(8, -5) and (-2, -8). 
8. Find the centroid of equal particles placed at five of the six 
vertices of a regular hexagon. 
9. Find the centroid of the cross-section of an angle-iron, the sides 
being 5 inches and 8 inches, and the thickness of each flange 1 inch. 
Ans. (17/6, 4/3). 


CHAPTER VII 
MOMENTS 


47. Moments of Frequency Distributions and Curves.— 
We have noticed that when a large number of careful observa- 
tions are made, and when these observations are classified to 
form a frequency distribution, the frequencies are usually dis- 
tributed in such a smooth manner that frequency curves are 
naturally suggested. These frequency curves have been 
studied and investigated at great length in various ways, but 
in particular by what is called the method of moments. 
Although moments are employed in various branches of applied 
science, it will be found desirable to develop the method along 
special lines in order that it may be applicable to the discrete 
form of statistical data. 

We shall define the n-th moment of a point (x, y) with respect 
to the y-axis as the product of the ordinate y and the n-th 
power of the abscissa x. We may then extend this definition 
to certain strips of area by the relation 

$$\text{n-th moment} = \lim_{\Delta A \to 0} \sum x^n \, \Delta A \qquad (40)$$

where $\Delta A$ is an element or strip of the given area parallel to the 
y-axis whose distance from the y-axis is x, and where the 
summation is to extend over the entire area concerned. It follows, 
then, that if the area is bounded by a curve whose equation is 
y=f(x), the ordinates at x=a and x=b, and the x-axis, relation 
(40) may be replaced by the definite integral 

$$\text{n-th moment} = \int_a^b x^n y \, dx \qquad (40')$$

If the area under consideration is that of a rectangular 
histogram, relation (40) may be written in the more suitable form 

$$\text{n-th moment} = \sum_{x=a}^{b} x^n y \qquad (40'')$$

which naturally suggests the use of finite integration. Finite 
integration is singularly appropriate when no simpler method 
is available; but so many valuable and instructive applications 
can be made where ordinary multiplication and addition will 
suffice that we shall confine our attention for the most part to 
those applications. Certain applications which call for the 
use of the integral calculus, and which will lead to some of the 
most important conceptions treated in this course, will be sug- 
gested in exercises and in the theory considered in later sections. 
The following small distribution gives the frequencies of litters 
of the corresponding number of mice as found in certain breeding 
experiments. The typical method of computing the first and 
second moments is shown to the right. The higher moments 
are computed in a similar manner. 


Number in Litter    Frequency 

       x                y         xy       x²y     (x+1)²y 

       1                7          7         7        28 
       2               11         22        44        99 
       3               16         48       144       256 
       4               17         68       272       425 
       5               26        130       650       936 
       6               31        186      1116      1519 
       7               11         77       539       704 
       8                1          8        64        81 
       9                1          9        81       100 

                      121        555      2917      4148 

Thus, the zero-th moment or total frequency = 121, 
the first moment = 555, 
and the second moment = 2917. 


To the right is shown a method of checking the computation 
of the first and second moments, called the Charlier method of 
check. Since $(x+1)^2 y = x^2 y + 2xy + y$, it is evident that the 
value of $\sum (x+1)^2 y$ must be the same as the sum of the second 
moment, twice the first moment and the zero-th moment. 
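
The moments of the litter distribution and the Charlier check can be reproduced in a few lines. This is a sketch only; the dictionary `litters` and the helper `raw_moment` are names introduced here for illustration.

```python
# Number in litter -> frequency, from the distribution of litters of mice.
litters = {1: 7, 2: 11, 3: 16, 4: 17, 5: 26, 6: 31, 7: 11, 8: 1, 9: 1}


def raw_moment(dist, n):
    """n-th moment about the y-axis: the sum of x^n * y."""
    return sum(x ** n * y for x, y in dist.items())


m0, m1, m2 = (raw_moment(litters, n) for n in range(3))
charlier = sum((x + 1) ** 2 * y for x, y in litters.items())

print(m0, m1, m2)                    # 121 555 2917
print(charlier == m2 + 2 * m1 + m0)  # True
```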


In this case 
4148 = 2917 + 2(555) + 121. 


EXERCISES 


Compute the values of the first and the second moments of the 
following distributions: 


1. The frequencies of the various numbers of certain glands found 
in the right forelegs of 2000 female swine were found to be as follows: 


Number      Frequency 
  0             15 
  1            209 
  2            365 
  3            482 
  4            414 
  5            277 
[The remaining rows of this table are illegible; the total frequency is 2000.] 


What is the value of the mean? 
2 and 3. The frequencies of the various numbers of children per 
wife were compiled from certain genealogical records of American 
families for several periods, of which the following distribution (2) 
refers to the period preceding 1700 and the distribution (3) to the 
period 1870-1879: 


[Table: Number of Children, Frequency (2), Frequency (3); the entries are illegible.] 


What is the average number of children per wife? 
4. The budgets of 421 Smith College girls for 1914-1915 were 
found to be: 


Budget     Frequency 
 $450          66 
  650         169 
  850         109 
 1050          43 
 1250          20 
 1450           8 
 1650           3 
 1850           3 

              421 


What is the average budget? 


48. Fitting Curves by Moments.—We have already men- 
tioned the frequent desirability of fitting curves to given fre- 
quencies, that is, of determining the analytic expression that 
gives computed frequencies which agree very closely with the 
given frequencies. We have mentioned also the peculiar ease 
with which rational integral functions may be used for this 
purpose. Moments are very useful in fitting rational integral 
functions, or polynomials of the form y=a+bx+cx²+ etc., to 
given frequencies. 

In fitting such curves it is well to decide beforehand, from 
the appearance of the frequency curve obtained by plotting 
the given frequencies, what degree of the polynomial seems 
most appropriate with respect to both accuracy of fit and sim- 
plicity of the polynomial desired. It should be kept in mind 
that it is always possible to find the equation of a polynomial of 
the n-th degree which is satisfied exactly by the coordinates of 
any n+1 points. Such a curve, however, may fit the situation 
less well than a simpler curve, because many, if not most, of 
the vagaries of the positions of the points may be due to errors 
in observation. In the problem before us, it is not to be 
expected that the curve finally obtained will fit exactly all the 
points corresponding to the various frequencies, but merely 
that the curve will fit them as well as or better than any other 
curve of the same kind and degree. 

The fundamental line of procedure is to equate moments of 
the theoretical curve (represented by the polynomial selected) 
to the corresponding moments of the given frequencies and 
thereby set up equations whose solution will give the values of 
the coefficients of the polynomial. The method can be explained 
best by a simple example. Let us fit the straight line y=a+bx 
(that is, determine the values of a and b) to the points (1, 2), 
(3, 9) and (5, 14). It is obvious on plotting that no straight 
line will pass exactly through the three points. 

The zero-th moment of the polynomial is the sum of the 
ordinates or values of y for x=1, 3 and 5 or 


(a+b) +(a+3b) + (a+5b) =3a+9b. 


The zero-th moment of the given ordinates is likewise the 
sum of the ordinates or 
2+9+14=25. 
Equating, 
3a+9b=25, 

which is one of the two equations (since there are two unknowns) 
desired. 
It is left for the student to determine the first moments of 
the polynomial and of the given values, and show that 

(a+b)+3(a+3b)+5(a+5b)=2+3·9+5·14, 

or 
9a+35b=99. 
Solving the two equations simultaneously, we obtain 
a=-2/3 and b=3 and the desired equation becomes 
y=-2/3+3x. 

To appreciate the result, the three points and the line just 
obtained should be plotted. If we substitute 1, 3 and 5 in the 
equation just found, we obtain y=2⅓, 8⅓ and 14⅓ respectively. 
These values of y are called graduated values of 2, 9 and 14. 
The graduation of a set of observations or values evidently 
depends very vitally upon the selection of the function to which 
they are graduated. 
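
The fitting of the straight line by equating zero-th and first moments can be sketched in Python as follows. The function name `fit_line_by_moments` is hypothetical, and exact fractions are used so that the coefficients appear without rounding.

```python
from fractions import Fraction


def fit_line_by_moments(points):
    """Fit y = a + b*x by equating the zero-th and first moments of the
    line to those of the given ordinates."""
    n = len(points)
    sx = sum(Fraction(x) for x, _ in points)
    sxx = sum(Fraction(x) ** 2 for x, _ in points)
    sy = sum(Fraction(y) for _, y in points)
    sxy = sum(Fraction(x) * y for x, y in points)
    # Zero-th moments:  n*a + sx*b  = sy
    # First moments:    sx*a + sxx*b = sxy   (solved by Cramer's rule)
    det = n * sxx - sx * sx
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b


a, b = fit_line_by_moments([(1, 2), (3, 9), (5, 14)])
print(a, b)  # -2/3 3

# Graduated values at x = 1, 3 and 5: 7/3, 25/3 and 43/3.
graduated = [a + b * x for x in (1, 3, 5)]
```

For the three points of the example the two equations solved are 3a+9b=25 and 9a+35b=99.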


EXERCISES 


In selecting the form of a polynomial to be fitted to a set of data, 
the data should be plotted as a check upon the nature of the curve to 
be fitted; the successive differences should also be computed, and the 
order of differences which proves most nearly to be constant should 
decide what degree the polynomial to be fitted should be assumed to 
have. 

1. The pressure (P) of water in pounds per square inch at different 
depths (D), in units of 10 feet, was found by experiment to be as 
follows: 


P, 8.66 17.82 25.99 34.65 43.31 
D, 2 4 6 8 10 
P, 51.98 60.64 69.31 77.97 86.63 
D, 12 14 16 18 20 


Determine a relation of the form P=aD+b. 
2. In experiments to determine the effort necessary to raise 
different loads with a crane, the following values were obtained: 


E, 60 70 79.5 89.5 99 108.5 118.25 128 138 147.5 
R, 1080 1283 1483 1683 1883 2083 2283 2483 2683 2883 


Determine a relation between the load R and the effort E of the 
form R=aE+b. 
3. Same as Problem (2) but using the following data: 


E, 14.2 26.6 38.1 50 59.1 72 81.8 91.5 
R, 28 56 84 112 140 168 196 224 


4. In an experiment with a Weston differential pulley block, the 
effort E in pounds required to raise a load W, in units of 10 pounds, 
was found to be as follows: 


W, 1 2 3 4 5 6 7 8 9 10 
E, 3.25 4.875 6.25 7.50 9 10.50 12.25 13.75 15 16.50 


Determine the relation E=aW+b. 


The data of Exercises 4-6, 8, 10-15 are from Kenyon and Lovitt’s “Mathematics for 
Collegiate Students of Agriculture and General Science.” 
5. The readings of a standard gas-meter S, and those of a meter 
T being tested on the same line, were found to be: 


S, 3000 3510 4022 4533 
T, 0 500 1000 1500 


Determine the relation S=aT+b. What are the meanings of a 
and b? 

6. The following observations were made, where y denotes the 
melting point (C.) of an alloy of lead and zinc containing x per cent of 
lead. 

x, 40 50 60 70 80 90 
y, 186 205 226 250 276 304 


Determine the relation y=a+bx+cx². Suggestion: Express x 
in terms of units of 10’s. 

7. The distance S in feet passed over by a falling body in t seconds 
was found by experiment to be: 


S, 0 5 16 35 65 
t, 0 .5 1 1.5 2 


Determine the relation S=at². Suggestion: Fit the line S=au 
where u=t². Ans. S=16.1t². 
8. Same as Problem (7) but using the data: 


& Bi 96 Bis ah a 
é, 5 1 1.5 2 2.5 2 


9. The following data give the velocities of water in the Mississippi 
river at various depths x for the point of observation chosen, the total 
depth being taken as unity. 


y, 3.1950 3.2299 3.2532 3.2611 3.2516 
x, 0 .1 .2 .3 .4 
y, 3.2282 3.1807 3.1266 3.0594 2.9759 
x, .5 .6 .7 .8 .9 


Ascertain by differences the appropriate degree for the polynomial 
y=a+bx+cx²+ ... and determine the coefficients. 
10. The pressure p measured in centimeters of mercury, and the 
volume v, measured in cubic centimeters, of a gas kept at a constant 
temperature, were found to be as follows: 


v, 145 155 165 178 191 
p, 117.2 109.4 102.4 95 88.6 


Determine a relation of the form pv=k. Suggestion: Take the 
logarithm of both sides of the equation. (Express the data in 
logarithms.) 

11. The amount of water A in cubic feet that will flow per minute 
through 100 feet of pipe of diameter d in inches, with an initial pres- 
sure of 50 pounds per square inch, was found to be as follows: 


d, 1 1.5 2 3 4 6 
A, 4.88 13.43 27.50 75.13 152.51 409.54 


Find the relation $A=kd^n$. (Use logarithms). Ans. $A=4.88d^{2.47}$. 


12. In testing a gas engine, corresponding values of the pressure p 
measured in lbs. per sq. ft. and the volume v in cubic ft. were obtained 
as follows: 

v, 7.14 t.%3 
p, 54.6 50.7 45.9 


Determine the relation p=kv". Ans. p=887.6v~*°8, 
13. Same as Problem (12) but using the data: 
v, 6.27 5.34 3.15 
p, 20.54 25.79 54.25 
Ans. $pv^{1.41}=273.5$. 
14. Given the age and height in feet of a tree, as follows: 


247 


Agev, 18 34.4 50.5 218 
72.5 73 


z, 18.4 27.5 38.4 
Determine the relation v=k2". 


15. The specific gravity y of dilute sulphuric acid at different 
concentrations x per cent is given as follows: 


x, 5 10 15 20 25 30 35 
y, 1.033 1.068 1.101 1.139 1.178 1.218 1.257 


Determine an appropriate relation. 

49. Unit Moments: Quadrature Formulas. It should be 
obvious that any particular moment of a distribution as here- 
tofore defined may be made to assume any value we please by 
changing all the frequencies appropriately and proportionally; 
that is, the value of a particular moment depends upon the 
value of the total frequency. Unless, then, some standard 
for the total frequency or area is established, the values of the 
moments can not be controlled in a satisfactory manner. For 
this reason, a standard area or frequency is assumed. The 
most useful area to be taken as the standard is unity, and the 
moments computed on such a basis are called unit moments. 
The values of the unit moments could be obtained in the precise 
manner illustrated in the preceding section if each of the 
frequencies were first divided by the total frequency, but the 
same result can be obtained much more easily if the moments 
are first computed in the manner illustrated and then divided 
by the total frequency. It should be evident on a little 
investigation that the two plans must give identical results. If we 
let the symbols $\nu_0'$, $\nu_1'$, $\nu_2'$, etc., refer to the zero-th, first, 
second, etc., unit moments, then, for the distribution of litters 
of mice given above 

$$\nu_0' = \frac{121}{121} = 1,$$

$$\nu_1' = \frac{555}{121} = 4.59,$$

$$\nu_2' = \frac{2917}{121} = 24.1.$$

Particular attention is called to the fact that the first unit 
moment or 

$$\nu_1' = \frac{\sum x f(x)}{N},$$

where f(x) represents the frequency of the observation x and 
N is the total frequency, is the arithmetic average. The 
zero-th unit moment invariably has the value unity. Why? 
As the only moments (with a few exceptions) to which we 
shall refer in the future will be unit moments, we shall refer to 
them from now on by the single term “moment.” 
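
The unit moments of the litter distribution can be obtained by computing each moment and then dividing by the total frequency, as described above. A minimal sketch, with `litters` and `unit_moment` as illustrative names:

```python
# Number in litter -> frequency, from the distribution of litters of mice.
litters = {1: 7, 2: 11, 3: 16, 4: 17, 5: 26, 6: 31, 7: 11, 8: 1, 9: 1}


def unit_moment(dist, n):
    """n-th moment about the y-axis divided by the total frequency."""
    total = sum(dist.values())
    return sum(x ** n * y for x, y in dist.items()) / total


print([round(unit_moment(litters, n), 2) for n in range(3)])  # [1.0, 4.59, 24.11]
```

The first unit moment, 4.59, is the arithmetic average of the number in a litter.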

Just as the area of a rectangular histogram (or the sum of the 
ordinates) can rarely be expected to be more than an approxi- 
mation of the area under the corresponding frequency curve, so 
the values of the moments (which include the area as a special 
case) of a rectangular histogram or frequency distribution can 
be expected to be no more than approximations of the cor- 
responding moments of the corresponding frequency curve. 
It is highly desirable, therefore, to distinguish the moments of a 
curve by a suitable notation. Accordingly, we shall refer to the 
moments of a curve by the symbols $\mu_0'$, $\mu_1'$, $\mu_2'$, etc.; the 
notation for the corresponding moments of a frequency distribution 
(or $\nu_0'$, $\nu_1'$, $\nu_2'$, etc.) has already been given. 

It is scarcely necessary to say that the values of the moments 
of a curve can be computed accurately only by means of the 
ordinary calculus, and that when we equate the corresponding 
moments of a curve and of a frequency distribution a discrep- 
ancy is thereby introduced. In most of the applications which 
we shall consider these discrepancies will not prove serious, but 
in applications where considerable refinement and accuracy are 
essential it would be practically necessary to apply certain cor- 
rections to the moments of the frequency distribution by means 
of formulas known as quadrature formulas. There are many 
forms of these quadrature formulas, but all are obtained on the 
assumption that the given frequencies have been fitted by a 
suitable curve; the quadrature formulas are then no more than 
formulas for the moments of these curves, or corrections to be 
applied to the moments (called the rough moments) of the given 
frequencies, expressed so that they can be readily adapted to 
any situation. No treatment of the method of moments as 
applied to statistical data can be regarded as complete without 
a complete treatment of quadrature formulas, but a complete 
treatment of such formulas would carry us so far from the main 
purposes of this course that it must be omitted. Nothing less 
than an intensive treatment would give the student a satisfactory 
working knowledge of such formulas. Moreover, the 
choice of a suitable quadrature formula is very frequently open 
to controversy even among those who have considerable experi- 
ence with work of this kind. It should be kept constantly in 
mind, however, from now on, that the moments of a frequency 
distribution are only approximations of the corresponding 
moments of the corresponding frequency curve. 

50. Moments About the Mean.—We defined moments 
originally with respect to the y-axis; but it is obvious that the 
value of a moment will depend not only upon the value of the 
total frequency but also upon the position of the y-axis or the 
axis of reference. As the translation of the axes is a common 
procedure in mathematical analysis, it is only natural for pur- 
poses of clearness and definite understanding to establish a 
standard position for this axis of reference. The standard 
position of the axis of reference is taken at the mean, which is 
readily determined by computing the first moment. Reference 
to the method, given previously, of computing the correction 
to a provisional or trial mean shows that the value of the first 
moment about the mean is zero and that the mean is graphically 
the abscissa corresponding to the ordinate about which the 
total area balances, and, hence, which passes through the center of gravity 
of the area. The center of gravity is usually referred to in 
moments as the centroid, and the vertical axis of reference 
which passes through the centroid as the centroid vertical. 

If we use the symbols adopted previously, but without 
primes, to refer to the moments about the mean, then, obviously 

$$\mu_0 = 1$$
and 
$$\mu_1 = 0.$$

It is easy to determine the values of the moments about the 
mean from the values of the moments about any other axis of 
reference, by means of the lateral transformation used so 
frequently in elementary analysis. 
Let the distance between any axis of reference A about 
which the moments are known and the ordinate B at the mean 
be d. Then, if the distance between any ordinate y and A is x′ 
and between y and B is x 

$$x' = x + d,$$
and 
$$(x')^n = (x+d)^n.$$
Hence, the n-th moment about the ordinate A is 

$$\begin{aligned}
\mu_n' &= \frac{1}{N}\sum (x')^n y \quad \text{(where $N$ refers to the total area)} \\
&= \frac{1}{N}\sum y\left(x^n + n d\, x^{n-1} + \frac{n(n-1)}{2!}\, d^2 x^{n-2} + \text{etc.}\right) \\
&= \mu_n + n d\, \mu_{n-1} + \frac{n(n-1)}{2!}\, d^2 \mu_{n-2} + \text{etc.},
\end{aligned}$$
or transposing 
$$\mu_n = \mu_n' - n d\, \mu_{n-1} - \frac{n(n-1)}{2!}\, d^2 \mu_{n-2} - \text{etc.} \qquad (41)$$

It should be noted that the formula just derived is an 
accumulative form; that is, moments about the mean are 
employed to find the next higher moment about the mean. It 
should be noticed also that the symbol $\mu$ would really be 
appropriate only if the summations considered above were 
performed by integration; since, however, formula (41) holds 
for either of the symbols $\mu$ or $\nu$, the essential thing is to remember 
the distinction in the use of the symbols. 
Substituting n=0 and n=1 in formula (41) we obtain 
$$\mu_0 = \mu_0' \ (=1),$$

$$\mu_1 = \mu_1' - d\mu_0 = \mu_1' - d.$$

But since 
$$\mu_1 = 0,$$
$$\mu_1' - d = 0,$$
or 
$$d = \mu_1'. \qquad (42)$$


The latter relation shows that the first moment about any 
ordinate (i.e., the arithmetic average of the deviations) is the 
directed distance from that ordinate to the mean. (Compare 
this fact with the method of computing the correction to be 
applied to the provisional mean considered in connection with 
the arithmetic average.) For example, suppose that the first 
moment of the frequency distribution of litters of mice were 
computed about the provisional mean x=5 as follows: 


  x      y      x′     x′y 
  1      7      -4     -28 
  2     11      -3     -33 
  3     16      -2     -32 
  4     17      -1     -17 
  5     26       0       0 
  6     31       1      31 
  7     11       2      22 
  8      1       3       3 
  9      1       4       4 

       121             +60 
                      -110 
                       -50 
Therefore 
$$\nu_1' = \frac{-50}{121} = -0.41.$$

The mean is therefore the provisional mean plus the value 
of $\nu_1'$ or 5-0.41=4.59, as found previously. 
It is left for the student to obtain the following relations 
from formula (41): 
$$\mu_2 = \mu_2' - d^2. \quad \text{Compare with (38).} \qquad (43)$$
$$\mu_3 = \mu_3' - 3d\mu_2 - d^3. \qquad (44)$$

etc. 
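
Formula (41) lends itself to a short recursive computation, each moment about the mean being built from the lower ones. The sketch below is illustrative: the function name `moments_about_mean` is an assumption, and the written-out form of (41) with binomial coefficients is the standard expansion of (x+d)ⁿ rather than something stated term by term in the text.

```python
from math import comb

# Number in litter -> frequency, from the distribution of litters of mice.
litters = {1: 7, 2: 11, 3: 16, 4: 17, 5: 26, 6: 31, 7: 11, 8: 1, 9: 1}


def moments_about_mean(primed):
    """primed[n] is the n-th unit moment about an arbitrary axis
    (primed[0] must be 1). Returns the unit moments about the mean."""
    d = primed[1]  # relation (42): d equals the first moment about the axis
    central = []
    for n, m in enumerate(primed):
        # Formula (41): subtract C(n, k) * d^k * (moment of order n - k).
        central.append(m - sum(comb(n, k) * d ** k * central[n - k]
                               for k in range(1, n + 1)))
    return central


N = sum(litters.values())
# Unit moments about the provisional mean x = 5, up to the third order.
primed = [sum((x - 5) ** n * y for x, y in litters.items()) / N
          for n in range(4)]
central = moments_about_mean(primed)
mean = 5 + primed[1]

print(round(mean, 2))        # 4.59
print(round(central[2], 2))  # 3.07
```

Here `central[1]` comes out as zero, and `central[2]` and `central[3]` correspond to relations (43) and (44).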


EXERCISES 


1-4. Compute the values of the first and second unit moments 
about the mean of the distributions given in the exercises following 
Art. 47. 

5. The number of anal fin-rays was found, for 1000 minnows, to be: 


Number Frequency 
13 5 
12 144 
11 554 
10 279 
9 15 
8 2 
7 1 


Compute the first and second moments about the mean. 


6. The weights of a large group of British males (adults) were 
found to be: 


Weight    Frequencies    Weight    Frequencies 
  90           2          190         263 
 100          26          200         107 
 110         133          210          85 
 120         338          220          41 
 130         694          230          16 
 140        1240          240          11 
 150        1075          250           8 
 160         881          260           1 
 170         492          270           0 
 180         304          280           1 
Compute the first and second moments about the mean. What 
is the mean? 


7. Seed capsules of Shirley Poppies were found to have numbers of 
stigmatic rays in accordance with the following distribution: 


Number     Number        Number     Number 
of Rays    of Capsules   of Rays    of Capsules 
   6            3           14         302 
   7           11           15         234 
   8           38           16         128 
   9          106           17          50 
  10          152           18          19 
  11          238           19           3 
  12          305           20           1 
  13          315 

Compute the first and second moments about the mean. What 
is the mean? 

8. The head-breadths of 1000 students at Cambridge University 
were measured, to the nearest one-tenth of an inch, to give the follow- 
ing distribution: 


Head-      Frequencies    Head-      Frequencies 
breadth                   breadth 
 5.5           3           6.2          142 
 5.6          12           6.3           99 
 5.7          43           6.4           37 
 5.8          80           6.5           15 
 5.9         131           6.6           12 
 6.0         236           6.7            3 
 6.1         185           6.8            2 
                                       1000 


Compute the first and second moments about the mean. 


51. The Standard Deviation.—The standard deviation 
has already been defined and is also evidently the square root 
of the second (unit) moment about the mean, or 

$$\sigma = \sqrt{\mu_2} = \sqrt{\mu_2' - d^2}. \qquad (45)$$

In the case of the frequency distribution of litters of mice, 
$$\sigma = \sqrt{24.1 - (4.59)^2} = 1.75 \text{ approximately.}$$

It is left for the student to obtain the same value by taking 
the trial mean at x=4 in the same distribution. 
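
That the standard deviation does not depend on the trial mean chosen can be illustrated numerically. A minimal sketch, with `sigma_from_trial_mean` as a hypothetical helper that applies relation (45) to moments taken about the trial mean:

```python
import math

# Number in litter -> frequency, from the distribution of litters of mice.
litters = {1: 7, 2: 11, 3: 16, 4: 17, 5: 26, 6: 31, 7: 11, 8: 1, 9: 1}


def sigma_from_trial_mean(dist, trial):
    """Standard deviation from the first and second unit moments
    taken about a trial mean."""
    n = sum(dist.values())
    d = sum((x - trial) * y for x, y in dist.items()) / n
    second = sum((x - trial) ** 2 * y for x, y in dist.items()) / n
    return math.sqrt(second - d * d)


print(round(sigma_from_trial_mean(litters, 0), 2))  # 1.75
print(round(sigma_from_trial_mean(litters, 4), 2))  # 1.75
```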

It is well to refer again to other names for the standard 
deviation, namely: dispersion, root-mean-square, mean error, 
etc. It is well also to refer again, for purposes of emphasis, to 
the great importance of the standard deviation in weighing the 
relative consistencies of sets of deviations. This use of the 
standard deviation or dispersion will form the basis of practi- 
cally all of the work of the succeeding chapters. 


EXERCISES 


Compute the value of the standard deviation of the distributions 
given in the preceding list of exercises. Remember to take the original 
unit of measurement into consideration. 


52. Computation of Moments by Summation.—The method 
of computing moments given in the preceding pages is the 
direct method and would suffice where little of that work is 
required, although it offers no systematic method of check 
upon the numerical work. If one expects to do a great amount 
of such computation, a knowledge of another method is 
desirable. This method, which is known as the method of 
summation,¹ affords successive checks upon the numerical work 
and is especially valuable if an adding machine is at hand. 

Suppose that the frequencies of the following distribution 
were summed accumulatively from the bottom up, as shown 


11t seems that the method of summation was used for some time before 
its definite connection with moments was established. See Elderton’s 
Frequency Curves and Correlation, p. 19. 
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to the right, and that we designate the final sum or the sum 
at the top by S1. 


2 y Set 1 

1 Yi Yrtyetyat ... +Yn=S1 
2 Y2 yoty3+ ests, aime 

n Yn Yn 


If, then, we denote the nth moment by $M_n$ (so that the nth
unit moment would be $M_n / M_0$), we have evidently

$$M_0 = S_1. \qquad (A)$$


Now let us repeat the process of accumulative summation, 
but on Set 1, to give 


Set 2

$$
\begin{array}{l}
y_1 + 2y_2 + 3y_3 + \cdots + n\,y_n = S_2 \\
y_2 + 2y_3 + \cdots + (n-1)\,y_n \\
y_3 + \cdots + (n-2)\,y_n \\
\vdots
\end{array}
$$

Designating the top sum by $S_2$, we have, evidently,

$$M_1 = S_2. \qquad (B)$$
Repeating the process upon Set 2, we obtain

Set 3

$$
\begin{array}{l}
y_1 + 3y_2 + 6y_3 + \cdots + \dfrac{n(n+1)}{2}\,y_n = S_3 \\
y_2 + 3y_3 + \cdots \\
y_3 + \cdots \\
\vdots \\
y_n
\end{array}
$$


In this and the following cases, the relation between the 
M’s and S’s becomes more complicated. In this case 


$$
\begin{aligned}
2S_3 &= 2y_1 + 6y_2 + 12y_3 + \cdots + n(n+1)\,y_n \\
-S_2 &= -y_1 - 2y_2 - 3y_3 - \cdots - n\,y_n \\
M_2 &= y_1 + 4y_2 + 9y_3 + \cdots + n^2\,y_n
\end{aligned}
$$

Hence,

$$M_2 = 2S_3 - S_2. \qquad (C)$$


Repeating the process once more, but upon Set 3, we obtain 


Set 4

$$
\begin{array}{l}
y_1 + 4y_2 + 10y_3 + \cdots + \dfrac{n(n+1)(n+2)}{6}\,y_n = S_4 \\
y_2 + 4y_3 + \cdots \\
y_3 + \cdots \\
\vdots \\
y_n
\end{array}
$$

It is easily verified, as in the preceding case, that

$$M_3 = 6S_4 - 6S_3 + S_2. \qquad (D)$$

Likewise,

$$M_4 = 24S_5 - 36S_4 + 14S_3 - S_2, \qquad (E)$$


and the process could be extended indefinitely. 

It is especially important to note the form of the general 
term in each sum S and to note that this general term is also 
the last term. 
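
Relations (A) to (E) can be checked numerically. The following Python sketch accumulates a frequency column from the bottom up, as described above, and compares the results with the moments computed directly. The frequencies are those of the litters example used later in this article; the helper names are illustrative assumptions.

```python
from itertools import accumulate

def top_sums(freqs, count):
    """Return [S1, S2, ..., S_count] by repeated accumulation from the bottom up."""
    column = list(freqs)
    tops = []
    for _ in range(count):
        # sum from the last class upward; the entry at the top is the next S
        column = list(accumulate(reversed(column)))[::-1]
        tops.append(column[0])
    return tops

def moment(freqs, k):
    """Direct kth moment about the origin for classes x = 1, 2, 3, ..."""
    return sum(x ** k * y for x, y in enumerate(freqs, start=1))

freqs = [7, 11, 16, 17, 26, 31, 11, 1, 1]
S1, S2, S3, S4, S5 = top_sums(freqs, 5)

print(moment(freqs, 0), S1)                                # relation (A)
print(moment(freqs, 1), S2)                                # relation (B)
print(moment(freqs, 2), 2 * S3 - S2)                       # relation (C)
print(moment(freqs, 3), 6 * S4 - 6 * S3 + S2)              # relation (D)
print(moment(freqs, 4), 24 * S5 - 36 * S4 + 14 * S3 - S2)  # relation (E)
```

Each printed pair should consist of two equal integers, because only additions and integer multiplications are involved.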

Since $M_0 = S_1$, we can divide the left side of each of the
equations obtained above by $M_0$, and the right side by $S_1$, to
give the successive unit moments (but not about the mean) in
terms of the accumulative sums. Since the essential forms of
the equations would remain unaltered, it is unnecessary to
rewrite the equations, but it is well to note that if they were
rewritten we should naturally replace each $M$ by the corresponding
$\nu'$ and each $S$ by, say, $s$, where

$$s_n = \frac{S_n}{S_1}. \qquad (F)$$


Since we shall practically always require the values of the 
moments about the mean, it remains simply to substitute the 
expressions obtained above (but in terms of $\nu'_n$ and $s_n$) in equation
(41), page 120. The substitutions are perfectly direct
and so are left as exercises. The results are as follows:

$$
\left.
\begin{aligned}
d &= s_2, \\
\nu_2 &= 2s_3 - d(1+d), \\
\nu_3 &= 6s_4 - 3\nu_2(1+d) - d(1+d)(2+d), \\
\nu_4 &= 24s_5 - 2\nu_3(3+2d) - \nu_2\{6(1+d)(2+d) - 1\} \\
      &\qquad - d(1+d)(2+d)(3+d),
\end{aligned}
\right\} \qquad (46)
$$

etc.




As an illustration, let us apply the method of summation 
to the distribution considered previously in this chapter. 


Class   Frequency
  x         y          Set 1          Set 2          Set 3

  1         7       121 = S_1      555 = S_2     1736 = S_3
  2        11       114            434           1181
  3        16       103            320            747
  4        17        87            217            427
  5        26        70            130            210
  6        31        44             60             80
  7        11        13             16             20
  8         1         2              3              4
  9         1         1              1              1
        ---------   ---------     ----------     ----------
        121 = S_1   555 = S_2     1736 = S_3     4406 = S_4


$$\nu_2 = \frac{2(1736)}{121} - 4.59(5.59) = 3.06, \qquad (\sigma = \sqrt{3.06} = 1.75),$$
and similarly for the higher moments. 
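
As a check on formulas (46), the following Python sketch evaluates them for this example. It is a sketch under stated assumptions: the function name is hypothetical, and the value S5 = 9758 is not printed in the text but is obtained by accumulating the last column once more.

```python
def central_moments_from_sums(S):
    """Apply formulas (46) to S = [S1, S2, S3, S4, S5]."""
    s2, s3, s4, s5 = (Sk / S[0] for Sk in S[1:])   # unit sums s_n = S_n / S_1
    d = s2
    nu2 = 2 * s3 - d * (1 + d)
    nu3 = 6 * s4 - 3 * nu2 * (1 + d) - d * (1 + d) * (2 + d)
    nu4 = (24 * s5 - 2 * nu3 * (3 + 2 * d)
           - nu2 * (6 * (1 + d) * (2 + d) - 1)
           - d * (1 + d) * (2 + d) * (3 + d))
    return d, nu2, nu3, nu4

# S1..S4 are read from the table; S5 is an extra accumulation (assumption).
S = [121, 555, 1736, 4406, 9758]
d, nu2, nu3, nu4 = central_moments_from_sums(S)
print(round(d, 2), round(nu2, 2), round(nu2 ** 0.5, 2))
```

Carrying more decimal places than the text does, the second moment comes out near 3.07 rather than 3.06; the standard deviation still rounds to 1.75.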


It should be clear that even if the values of x do not begin 
with unity and do not advance by naturally successive integers 
as in this illustration, we can assume for the time that they do. 
The value of each moment, say, the nth, found for the assumed 
distribution, should then be multiplied by the appropriate 
power, say, the nth, of the unit of measurement (class interval) 
and the value of d so found should be added to the class mark
which is next below the lowest class mark given originally.
Thus, if the original class marks were, say, 57, 62, 67, etc.
(instead of 1, 2, 3, etc.), we should assume 1, 2, 3, etc., and
then multiply the value of each moment as found, by the
appropriate power of 5. The value of the mean would then
be $52 + 5d$.
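
The change of unit and origin just described can be sketched as follows. The example is hypothetical: it places the frequencies of the litters illustration on class marks 57, 62, 67, etc., purely to show the conversion, and the function name is an assumption.

```python
def rescale(d, nu2, lowest_mark, interval):
    """Convert results found with assumed classes 1, 2, 3, ... to original units."""
    origin = lowest_mark - interval     # class mark next below the lowest one
    mean = origin + interval * d        # here: 52 + 5d
    sigma = interval * nu2 ** 0.5       # second moment scales by interval squared
    return mean, sigma

# d and nu2 as found by the summation method for the assumed classes 1..9
d = 555 / 121
nu2 = 2 * 1736 / 121 - d * (1 + d)

mean, sigma = rescale(d, nu2, lowest_mark=57, interval=5)
print(round(mean, 2), round(sigma, 2))
```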

It will be noticed that the total of each column is an S 
and that that particular S should be reproduced at the top 
of the next column in each case, thus affording a check upon 
the work of summation. It should also be noticed that 
the work could be performed very efficiently on an adding 
machine. 

A natural extension or modification of the method of 
summation treated above will be considered in the next 
article. 

53. A Natural Extension of the Method of Summation.— 
We have already noted the proper procedure in the method 
of summation when a different unit of measurement is tem- 
porarily assumed and when a different location of the origin 
is temporarily assumed, when, however, the latter yields a 
set of positive class marks. We have found, however, that 
the numerical work of the direct method of computing the 
values of moments is much simplified if the origin is first moved 
to some convenient place near the mean. This is true also 
of the method of summation, and since this scheme practically 
always involves the use of some negative class marks, we have 
still to consider the proper procedure in the method of sum- 
mation when some of the class marks are negative. 

It should be evident at once that formulas (46) still hold 
for the frequencies with positive class marks and that we 
have to deal only with those with negative class marks. 

Attention was called in the preceding article to the form 
of the general term in each sum S and to the fact that this 
general term was also the last term in the sum. Remembering 
that the accumulative summation always begins with the 
term having the class mark of greatest magnitude (negative
or positive), when $n$ is negative, the last term of

$$
\begin{array}{llll}
S_1 & (\text{i.e., } y_n) & \text{becomes } y_{-n}, & \\[4pt]
S_2 & (\text{i.e., } n\,y_n) & \text{becomes } -n\,y_{-n}, & (\alpha) \\[4pt]
S_3 & \left(\text{i.e., } \dfrac{n(n+1)}{2}\,y_n\right) & \text{becomes } \dfrac{n(n-1)}{2}\,y_{-n}, & (\beta) \\[8pt]
S_4 & \left(\text{i.e., } \dfrac{n(n+1)(n+2)}{6}\,y_n\right) & \text{becomes } -\dfrac{n(n-1)(n-2)}{6}\,y_{-n}, & (\gamma)
\end{array}
$$

etc.

But ($\alpha$) is the (negative) sum of $n$ of the $y_{-n}$ terms of the
preceding set; ($\beta$) is the sum of $n-1$ of the $y_{-n}$ terms of the
next set; ($\gamma$) is the (negative) sum of $n-2$ of the $y_{-n}$ terms
of the next set; and so on; all of which shows that in finding
the accumulative sums $S_1$, $S_2$, $S_3$, etc., for negative frequencies
(that is, frequencies with negative class marks)
the summation should include all of such frequencies for $S_1$ and
$S_2$, all but the last for $S_3$, all but the last two for $S_4$, and so on.

The $S$'s should, of course, be obtained separately for the
positive and for the negative frequencies and their algebraic
sum should then be divided by $S_1$ to give the unit $s$'s to be used
in formulas (46). The entire process is illustrated below in
connection with the distribution considered previously.


  x      y      Successive accumulated sums

  1      7        7      7      7      7
  2     11       18     25     32     39
  3     16       34     59     91
  4     17       51    110
  5     26
  6     31       44     60     80    105
  7     11       13     16     20     25
  8      1        2      3      4      5
  9      1        1      1      1      1
       ----     ----   ----   ----
       121       60     80    105


$$d = \frac{60 - 110}{121} = -0.41,$$

(mean is $5 - 0.41 = 4.59$)

$$s_3 = \frac{80 + 91}{121} = 1.41,$$

and, by (46),

$$\nu_2 = 2(1.41) - (-0.41)(0.59) = 3.06,$$

and similarly for the higher moments.


The proper procedure for distributions with different class 
marks is the same as that described in the preceding article, 
except possibly the preliminary correction of the trial mean 
(illustrated above). 

It will be noticed that this modification of the method of 
summation has the advantage, over the method considered 
in the preceding article, of dealing with smaller sums and 
therefore simpler computations. It, however, suffers some 
loss in the system of checking. 
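
The bookkeeping of this article can be sketched in Python. The sketch assumes the litters distribution with the origin moved to x = 5; the function name and argument layout are illustrative, not the author's. Each side is accumulated starting from the class mark of greatest magnitude, and on the negative side each column after the second omits the entry nearest the origin.

```python
from itertools import accumulate

def side_tops(freqs_far_to_near, count, negative_side):
    """Final entries of successive accumulation columns for one side of the origin."""
    column = list(accumulate(freqs_far_to_near))
    tops = [column[-1]]
    for k in range(2, count + 1):
        # negative side: from the third column on, drop the entry nearest the origin
        source = column[:-1] if (negative_side and k >= 3) else column
        column = list(accumulate(source))
        tops.append(column[-1] if column else 0)
    return tops

freqs = {1: 7, 2: 11, 3: 16, 4: 17, 5: 26, 6: 31, 7: 11, 8: 1, 9: 1}
origin = 5

pos = side_tops([freqs[x] for x in range(9, origin, -1)], 4, negative_side=False)
neg = side_tops([freqs[x] for x in range(1, origin)], 4, negative_side=True)

S1 = pos[0] + neg[0] + freqs[origin]
S2 = pos[1] - neg[1]          # negative-side sums alternate in sign
S3 = pos[2] + neg[2]
S4 = pos[3] - neg[3]

d = S2 / S1
nu2 = 2 * S3 / S1 - d * (1 + d)
print(round(origin + d, 2), round(nu2 ** 0.5, 2))
```

The mean and standard deviation should agree with those found in Art. 52, while the sums handled are much smaller.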

54. Moments by Integration.—It is well to call special 
attention to the fact that there are many important problems 
which would require considerable preliminary work in determining
moments by the calculus (by integration), using formula
(40′). This would be especially true in extensive and
systematic work in fitting curves to given data. As the latter 
type of work is beyond the scope of this book, we shall do no 
more than merely call attention to the possible importance 
of that work in moments. A sufficient amount of that kind 
of work for our purpose will naturally arise as we proceed. 


EXERCISES 


1. Verify formula (D), Art. 52.
2. Verify formula (E), Art. 52.
3. Verify formulas (46), Art. 52.
4. The chest measurements (in inches) of 10,000 men are given as
follows:


Inches   Frequency      Inches   Frequency
  33          6           41       1640
  34         35           42       1120
  35        125           43        600
  36        338           44        222
  37        740           45         84
  38       1303           46         30
  39       1810           47          5
  40       1940           48          2

Find the values of the mean and the standard deviation by the
summation method given in Art. 52. Ans. Mean = 39.835,
σ = 2.052.


5. Same as Ex. 4 but by the modification of the summation method, 
given in Art. 53. 

6. Same as Ex. 4 but for another distribution of the same kind of 
measurements. 


Inches Inches 
33 5 41 1628 
34 31 42 1148 
35 141 43 645 
36 322 44 160 
37 732 45 87 
38 1305 46 38 
39 1867 47 7 
40 1882 48 2 


7. Same as Ex. 5 but for the distribution given in Ex. 6. 

8. Compute the values of the mean and the standard deviation, by 
the method of summation, of the following distribution of observations 
(deviations in seconds of time) of the right ascension of Polaris. 


Deviations   Frequencies     Deviations   Frequencies
  −3.5            2              0            168
  −3.0           12              0.5          148
  −2.5           25              1.0          129
  −2.0           43              1.5           78
  −1.5           74              2.0           33
  −1.0          126              2.5           10
  −0.5          150              3.0            2

9. Show that the mean (i.e., the abscissa of the centroid) of the area
between the curve $y = x^2$, the x-axis, and the ordinates $x = 0$ and $x = 1$
is 3/4.

10. The equation of one of the types of the so-called Pearson
frequency curves is


Y=oe “De. 


Substitute yz for x and show, by integrating from 0 to ∞, that the
zeroth moment, or area, N is yoy!” [ (p—1). 
11. Find the nth unit moment (u’,) of the curve given in Ex. 10. 
Use the result obtained in Ex. 10, and also the same substitution. 
=H) 
Ans. p' n=" [eee =. 
PTT @-D 
EXERCISES WITH DISTRIBUTIONS OF THREE 
DIMENSIONS 


1. Twelve dice of which six were marked red, the rest being white, 
were thrown and the number of faces showing above 3 was noted, 
to give a “first throw.” The red dice were now left down and the 
white dice thrown again. In this second throw the total number of 
dice (red and white) now showing faces above 3 was noted, to give a 
“second throw.”’ This process was repeated 500 times to give the 
following distribution: 


Second Throws 


r= 2 a 4 5 6 7 8 9 10 
y= 1|1 1 1 

211 2 3 2 

ate 3 5 6 2 6 
Ped | 5 9 8 it 16 7 6 1 
pe ai? 5 17 24 19 25 11 2 
roll 5 14 25 24 24 17 4 3 
Ee 2 “ee ES yg eb) 2 
eS 2 7 13 22 14 5 3 
sali) 3 5 6 er ae 2 

10 2 1 2 

11 1 
Compute the values of the mean and the dispersion of


(a) the “first throws”; 
(b) the “second throws.” 


2. Compute, for the distribution given in the preceding exercise,
the value of what is called the (unit) product moment, designated and
defined by the relation:

$$\nu'_{xy} = \frac{\sum xy}{N},$$

where x and y represent corresponding values of those variables, and
N is the total number of such pairs of values.


3. It will be shown later (Art. 78) that the (unit) product moment
about the centroid of a distribution of three dimensions is given by the
relation

$$\nu_{xy} = \nu'_{xy} - hk,$$

where, in computing $\nu'_{xy}$, $x$ and $y$ are measured from trial means of the
values of these variables and $h$ and $k$ are corrections to be applied to
these trial means.

Assume trial means of both the ‘first throws” and the ‘second 
throws” in the distribution of Ex. 1 to be 6 and compute the value of 
the product moment about the centroid. 


4. Apply the formula given in the preceding exercise to the values 
of the means found in Ex. 1 and the result obtained in Ex. 2 to check 
the value of the product moment about the centroid obtained in the 
preceding exercise. 


5. The following distribution gives the results of an investigation 
into the relation between temperature and rain precipitation. The 
frequencies refer to months, and the scales at the top and at the side 


to the amount of monthly precipitation and monthly mean temperature 
respectively: 


New York 


Temperature 


if f5) 
27.5 
37.5 
47.5 
57.5 
67.5 
(7.5 
87.5 
97.5 
107.5 
Precipitation (Inches) 

2.15 3.75 4.75 (5) (6), 745) 
7 8 8 12 4 
25 21 16 28 ilal 
14 15 15 31 U 
13 10 11 14 7 
14 6 5 14 3 
6 1 4 7 1 
2 2, 1 3 2 

2 1 

1 1 


Compute the values of the mean and the dispersion of 


(a) the “temperature” distribution; 
(b) the “precipitation”’ distribution. 


6. Compute the value of the (unit) product moment about the
centroid of the distribution given in the preceding exercise. Be sure
to give due consideration to the units of measurement.

7. The following distribution gives the corresponding maximal daily 
July temperatures in New York and Boston for the years 1911-1920: 


61 64 67 


— 
= 


1 


mS eS or bo 


LOM (oie co 


SPnNnnNnNaAwWO®- 


wWOwnanrnrebd bv 
91 94 97 100 103 


Boston 
79 82 85 88 
1 1 

1 
a 6 8) 
7 9 4 8 
) by 
Py fe IO 
Py 8} 
i a 1 


— 
SBP Wore Wd 


wn bv 


1 
3 
3 


2 
3 


1 
il 


Compute the value of the product moment about the centroid. 


CHAPTER VIII 
THE NORMAL CURVE 


55. The Normal Curve.—We referred in earlier sections to 
the tendency of a certain kind of errors to compensate or 
offset each other. It will be remembered also that we called 
attention to the fact that certain kinds of "deviations,"
"residuals," etc., behaved in like manner and that we would
agree to refer to the whole general class of such items by the
term "errors." Suppose that we have a large number, say
several thousand, of such errors which compensate in the
most ideal manner, and suppose that these errors are recorded
in classes according to sizes, to give corresponding frequencies.
The question naturally arises: What would be the most
natural form of the corresponding frequency curve under
such ideal conditions? We should probably all agree upon the
two most important characteristics of the curve. We should
expect a considerable "hump" or a point of maximum at the
mean (that is, we should expect the smallest errors to occur
most frequently), and we should expect the curve to approach
the x-axis on both sides of the vertical axis of reference taken
at the mean in the same way, so that the curve would be
symmetrical with respect to this axis. Such a frequency
curve would look much like one of the following curves and is
known by several names: the normal curve, the probability
curve, the error curve, the Gaussian¹ curve, etc. We shall
refer to it as the normal curve, and to any frequency distribution
whose corresponding frequency curve is a normal
curve as a normal distribution.


FIG. 3. (Normal curves A and B.)


¹ As a matter of fact, Laplace was an earlier contributor of knowledge
concerning the curve than Gauss.


Now it is only too obvious that we can no more expect 
a frequency distribution, even under the most ideal conditions, 
to prove truly normal in practice than we can expect an a
priori probability to be verified exactly in a large number of
trials in practice. A certain amount of variation is to be 
expected under any circumstances, and the discrepancies will 
vary in different distributions all the way from discrepancies 
which are so small that they would naturally be expected to 
discrepancies large enough to rule the given distribution out 
of further consideration as a normal distribution for practical 
purposes. Since truly normal distributions can not be expected 
in practice, we shall take the liberty of referring to many 
frequency distributions as normal distributions which are only 
approximately so but which differ from truly normal dis- 
tributions so little that the variation might easily be ascribed 
to random variation. Thus, the following distribution, which 
gives the actual results of shooting one thousand times at a 
target consisting of a vertical line, would probably be regarded 
as a normal distribution in the sense expressed above: 
Deviations    Frequencies
    x              y

   −5              1
   −4              4
   −3             10
   −2             89
   −1            190
    0            212
    1            204
    2            193
    3             79
    4             16
    5              2
                ----
                1000


After all, whether a given distribution is normal or not normal 
is not of great importance. We shall find as we proceed that 
the matter which is of the greatest importance is whether the 
distribution, of which we may have only a few sample numerical 
observations, could be assumed to be sufficiently normal to 
permit us to draw certain conclusions which we shall consider 
later. There are diversities of opinion and attitude in regard 
to the whole question, and no absolute criterion or rule can be 
laid down; but it has been found from experience that the vast 
majority of all frequency distributions of deviations from the 
mean of numerical observations made upon a size or character- 
istic of a single object, where there is cogent reason for believing 
that deviations in one direction are just as probable as devia- 
tions in the other direction, are normal distributions in the 
sense adopted above. On the other hand, distributions of 
deviations of observations made upon a characteristic of 
several objects, even though those objects belong to the same 
general class, can rarely be expected to be normal. Thus, the 
distribution of a large number of refined readings of a barom- 
eter, made under proper conditions, is very apt to be normal, 
while the distributions of lengths of a large number of ears
of corn or of leaves of trees, of the various incomes or of populations
by ages of a large community, are apt to vary considerably
from normal distributions.

56. The Derivation of the Equation of the Normal Curve.— 
If we take the origin at the mean of the normal curve, the slope 
of the curve at x=0 must be zero. Moreover, the curve would 
approach the x-axis alike on both sides of the y-axis so gradually 
that the slope of the curve would approach zero as x increased 
in absolute value without limit or, what amounts in this case 
to the same thing, as y approached zero. All these properties 
of the curve can be expressed algebraically by the differential 
equation 


$$\frac{dy}{dx} = -kxy,$$


where the negative sign is inserted to insure that there shall
be a maximum and not a minimum at $x = 0$, and $k$ is a constant.
As a matter of fact, it does not follow at all from what has been 
said that k must be a constant; but if the slope of the curve 
is to be zero nowhere else than where we have indicated, k 
must be either a constant or a function of the form 1/f (x). It 
remains, then, merely to explain that the differential equation 
corresponding to the latter alternative has been investigated 
at great length, especially for the case where it is assumed 
that $f(x)$ can be expanded in the form of a power series. The
latter assumption leads to a wonderful system of frequency
curves (known as the Pearson types² of frequency curves)
which include the normal curve as a special case, where k is a
constant. It is an interesting fact in that connection that
practically none of these frequency curves (except the normal
curve) is symmetrical with respect to the vertical axis of
reference and some of them do not have a maximum at $x = 0$
or have no maximum at all. (How can these facts be reconciled
with the form of the differential equation?) We are
justified, then, in assuming k to be a constant.


² See Pearson's "Tables for Statisticians and Biometricians."
Writing the differential equation given above in the form

$$\frac{1}{y}\frac{dy}{dx} = -kx,$$

and integrating, we obtain

$$\log y = -k\,\frac{x^2}{2} + \log y_0,$$

where the constant of integration is written as a logarithm for
purposes of combination. The final equation can then be
written

$$y = y_0\, e^{-\frac{kx^2}{2}}.$$

It simply remains, then, to investigate the constants $k$ and
$y_0$ and show that they can be expressed in terms of characters
with which we are already familiar. If we let $N$ denote the
area under the curve then

$$N = \int_{-\infty}^{\infty} y_0\, e^{-\frac{kx^2}{2}}\,dx.$$

If this integration be performed by parts, with $dv = dx$ and
$u$ as the exponential expression, we obtain

$$N = y_0\left[x\,e^{-\frac{kx^2}{2}}\right]_{-\infty}^{\infty} + y_0 k\int_{-\infty}^{\infty} x^2 e^{-\frac{kx^2}{2}}\,dx.$$

The first expression on the right evidently takes the indeterminate
form $\infty/\infty$ for each limit but is found, by following the
process outlined in a preceding section, to have the value
zero. The second expression on the right (the integral together
with its factor $y_0 k$) is to be recognized as $Nk$
times the second unit moment $\nu_2$ of the normal curve. Hence,
we have

$$N = Nk\nu_2,$$

or

$$k = \frac{1}{\nu_2} = \frac{1}{\sigma^2},$$

where $\sigma$ is the standard deviation of the curve.
The definite integral given above can be evaluated in another
way. Referring to equation (34) in the chapter on Gamma
and Beta functions, it should be evident that the value of the
integral is also

$$N = y_0\sqrt{\frac{2\pi}{k}} = y_0\,\sigma\sqrt{2\pi}.$$

Hence

$$y_0 = \frac{N}{\sigma\sqrt{2\pi}}.$$

The equation of the normal curve may then be written in
the final form

$$y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}}. \qquad (47)$$


It will be recalled that points of inflection of a curve are 
points where the curve leaves off being concave downwards 
to become concave upwards, or vice versa, and that such points 
are found by equating the second derivative to zero and solving. 
It is easily verified by that process that the points of inflection 
of the normal curve are at $x = -\sigma$ and $x = \sigma$.

The equation of the normal curve is sometimes expressed
in terms of what is called the modulus $c$, which is defined by the
relation $c = \sigma\sqrt{2}$, and sometimes in terms of what is called the
precision $h$, which is defined as the reciprocal of the modulus,
or $h = 1/c$.
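
The two properties just derived, that the area under the curve is N and that the points of inflection lie at a distance of one standard deviation from the mean, can be checked numerically. The Python sketch below is only an illustration; the values N = 1000 and σ = 1.58 are borrowed from the distribution of shots, and the helper names are assumptions.

```python
from math import exp, pi, sqrt

def normal_curve(x, n_total, sigma):
    """Ordinate of the normal curve in its final form, origin at the mean."""
    return n_total / (sigma * sqrt(2 * pi)) * exp(-x * x / (2 * sigma * sigma))

def second_difference(f, x, h=1e-3):
    """Central-difference estimate of the second derivative of f at x."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

N, sigma = 1000, 1.58
curve = lambda x: normal_curve(x, N, sigma)

# trapezoidal rule over 8 standard deviations on each side of the mean
steps = 16000
lo, hi = -8 * sigma, 8 * sigma
h = (hi - lo) / steps
area = h * (sum(curve(lo + i * h) for i in range(1, steps))
            + 0.5 * (curve(lo) + curve(hi)))
print(round(area, 6))

# concave downward inside the points of inflection, concave upward outside
print(second_difference(curve, 0.9 * sigma) < 0,
      second_difference(curve, 1.1 * sigma) > 0)
```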

57. Graduations of Normal Distributions: Tables of 
Ordinates.—The final form of the equation of the normal curve 
derived in the preceding section involves $N$ and $\sigma$ (and the
constant $\pi$); the former is the area under the curve and would
be given approximately by the total frequency of the corresponding
distribution; an approximate value of $\sigma$ would be
given by the value of the standard deviation of the distribution.
As the points of inflection of the curve are located at
$x = \sigma$ and $x = -\sigma$, it is evident that for large values of $\sigma$ the curve
is dispersed or spread out somewhat like curve B (in Art. 55)
and that for small values of $\sigma$ the curve assumes a more compressed
or steeper form similar to curve A. The form of the
curve then depends solely upon the value of $\sigma$ (assuming the
units of measurement upon the two axes to be the same); in
fact, if we replace $x/\sigma$ by $x$ in the equation and $y\sigma/N$ by $z$,
we obtain

$$z = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}},$$

which is obviously independent of $\sigma$. Extensive tables² of
values of $z$ have been compiled which are to be entered with
values of $x/\sigma$ to give values of $z$ which may be adjusted to correspond
to the frequencies of any normal distribution; a very
small table is given at the end of the next article.

Given any normal distribution, suppose that the values of
the mean and the standard deviation $\sigma$ are computed; and then
that a column of values of $x/\sigma$ corresponding to the values of
$x$ with the origin taken at the mean is set up, and these values
of $x/\sigma$ are used to enter a table of values of $z$. If the values
of $z$ so obtained are then finally multiplied by $N/\sigma$, we obtain
new or theoretical values of $y$. This column of theoretical
values of $y$ constitutes what is called a graduation or "fit"
of the observed frequencies given originally. It should be
obvious that a column of values of $z/\sigma$ constitutes a column
of probabilities of the occurrences of the corresponding deviations.
This explains why the normal curve is frequently
referred to as the probability curve. As an example, let us
graduate the following distribution of certain measurements
(cephalic indices) of Bavarian skulls. It is easily verified that
the mean is 83.148 and that the standard deviation is 3.32.
Hence, $N/\sigma = 900/3.32 = 271$. As the short table of ordinates
given in the next section is used, no great refinement in computation
will be employed; for example, the frequencies of
"75 and under" and "92 and over" are treated as single
ordinates in computing the mean and standard deviation
without serious effects, but the theoretical frequencies for
those intervals are determined by classes and then combined.
The details of the process of graduation should be clear after
a little study of the following results.


Measurements,   Frequencies,   Deviations,                             y = (N/σ)z
  x′ (mm.)          y′             x         x/σ         z        (to the nearest integer)

                                 10.148      3.06      0.00370        1.0 }
                                  9.148      2.75       .00909        2.5 }    9
75 and under        9.5           8.148      2.45       .01984        5.4 }
76                 11.5           7.148      2.15       .03955             11
77                 17             6.148      1.85       .07206             20
78                 37             5.148      1.55       .12001             33
79                 55             4.148      1.25       .18265             50
80                 72.5           3.148      0.948      .25840             70
81                 82             2.148      0.647      .32652             88
82                116             1.148      0.346    [illegible]         102
83                 98             0.148      0.0446     .39872            108
84                107             0.852      0.256      .38726            105
85                 82             1.852      0.558      .34446             93
86                 74             2.852      0.859      .28011             76
87                 58             3.852      1.16       .20357             55
88                 34.5           4.852      1.46       .13742             37
89                 19             5.852      1.76       .08478             24
90                 10             6.852      2.06       .04780             13
91                  8             7.852      2.37       .02406        6.5       7
92 and over         9             8.852      2.67       .01130        3.1 }
                                  9.852      2.97       .00485        1.3 }    5
               N = 900           10.852      3.27       .00190        0.5 }


² Pearson's "Tables."
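
The graduation by ordinates can be sketched in Python, computing z directly from its formula instead of reading it from a table. The mean, standard deviation, and N are those stated in the text; the function name is an assumption, and the computed values may differ slightly from the printed ones, which were taken from a short table without great refinement.

```python
from math import exp, pi, sqrt

def z(t):
    """Ordinate of the normal curve of unit area and unit standard deviation."""
    return exp(-t * t / 2) / sqrt(2 * pi)

N, mean, sigma = 900, 83.148, 3.32

theoretical = {}
for measurement in range(73, 95):
    t = abs(measurement - mean) / sigma          # deviation in units of sigma
    theoretical[measurement] = N / sigma * z(t)  # theoretical frequency y
    print(measurement, round(t, 3), round(z(t), 5), round(theoretical[measurement], 1))

print(round(sum(theoretical.values()), 1))       # should be close to N
```

The classes 73 to 75 and 92 to 94 correspond to the grouped entries "75 and under" and "92 and over" and would be combined after the computation.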


58. Graduations of Normal Distributions: Tables of Areas. 
—A normal distribution can be graduated in another way— 
by means of a table of areas of the normal curve. 
If we denote the area under a normal curve of unit total frequency from
$x = -x/\sigma$ to $x = x/\sigma$ by $\alpha$, then

$$\tfrac{1}{2}(1+\alpha) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x} e^{-\frac{x^2}{2\sigma^2}}\,dx.$$

If we replace $x/\sigma$ by $x$ we obtain

$$\tfrac{1}{2}(1+\alpha) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-\frac{x^2}{2}}\,dx.$$

It is easily seen that since $\tfrac{1}{2}(1+\alpha)$ denotes the area under
the normal curve from the extreme left to the positive abscissa
$x/\sigma$, then $\tfrac{1}{2}(1-\alpha)$ denotes the area from $x = x/\sigma$ to the extreme
right. Extensive tables³ of values of $\tfrac{1}{2}(1+\alpha)$ have been
compiled which are to be entered with values of $x/\sigma$; a small
table is given at the end of this article.

In graduating a normal distribution by means of a table 
of areas, class marks must be replaced by class limits—the 
class limits on the farther side from the mean. After the 
values taken from a table of areas have been multiplied by the 
total frequency N, each area so obtained must be deducted 
from the next greater area to obtain the successive class 
frequencies, except for the one class which contains the mean, 
in which case the areas on the two sides of the mean must be 
added. The process should be clear after a study of the 
following graduation of the distribution of 1000 shots given 
at the beginning of this chapter (Art. 55). It is easily verified
that the mean is 0.48 and $\sigma = 1.58$; $N = 1000$. The second
column gives the class marks measured from the mean and the
third column gives the corresponding class limits to be used
to enter the table of areas (the small table given at the end
of this article is used).


Frequencies,   Class    Class                                Theoretical
     y′        marks   limits, x     x/σ     N · ½(1+α)    frequencies, y

      1        5.48      5.98        3.80       1000              1
      4        4.48      4.98        3.16        999              5
     10        3.48      3.98        2.52        994             23
     89        2.48      2.98        1.90        971             77
    190        1.48      1.98        1.25        894            162
    212        0.48    { 0.98        0.62        732 }          236
                       { 0.02        0.01        504 }
    204        0.52      1.02        0.65        742            238
    193        1.52      2.02        1.28        900            158
     79        2.52      3.02        1.91        972             72
     16        3.52      4.02        2.54        994             22
      2        4.52      5.02        3.18        999              5
                                                                  1
   ----                                                        ----
   1000                                                        1000


³ Pearson's "Tables."

It should be noted that in the case of the class frequency 
236 the frequency 232 (=732—500) is on one side of the mean 
and the frequency 4(=504—500) is on the other. 
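
The graduation by areas can be sketched in Python, using the error function in place of a printed table of areas. The sketch differs from the procedure of the text in one labeled respect: it works with signed class limits and takes the difference of the areas at the two limits of each class, which amounts to the same thing as the subtraction and addition described above. The function name is an assumption.

```python
from math import erf, sqrt

def area_from_left(t):
    """(1/2)(1 + alpha): area of the unit normal curve to the left of t."""
    return 0.5 * (1 + erf(t / sqrt(2)))

N, mean, sigma = 1000, 0.48, 1.58   # values stated in the text

theoretical = []
for x in range(-5, 6):
    lower = (x - 0.5 - mean) / sigma   # signed class limits in units of sigma
    upper = (x + 0.5 - mean) / sigma
    theoretical.append(N * (area_from_left(upper) - area_from_left(lower)))

for x, y in zip(range(-5, 6), theoretical):
    print(x, round(y, 1))
```

Because the printed table carries x/σ to only two decimals, the class frequencies found this way may differ by a unit or two from those in the text.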

It need scarcely be stated that graduations to the normal
curve can not be expected to be very satisfactory unless the
given distribution is itself normal; otherwise, all that can be
said of the graduation is that it constitutes the best fit to the
normal curve that is possible; a graduation to some other curve
would in that case probably prove more satisfactory. If the
given distribution is normal it is reasonable to expect a graduation
to the normal curve to be the best fit regardless of the
curve employed. However, though the normal curve is
clearly the curve to which a given distribution should be graduated,
the given frequencies may be so "rough" (probably
because of the relatively small number of observations) that
the graduated or theoretical frequencies may not fit the
observed frequencies very closely. In such a case the graduation
must necessarily be a poor one but through no fault of the
method of graduation.


TABLES OF ORDINATES AND AREAS OF THE NORMAL CURVE


 x/σ   Ordinates,    Areas,        x/σ   Ordinates,    Areas,
           z        ½(1+α)                   z        ½(1+α)

 0.0    0.39894     0.50000        2.0    0.05399     0.97725
 0.1    0.39695     0.53983        2.1    0.04398     0.98214
 0.2    0.39104     0.57926        2.2    0.03547     0.98610
 0.3    0.38139     0.61791        2.3    0.02833     0.98928
 0.4    0.36827     0.65542        2.4    0.02239     0.99180
 0.5    0.35207     0.69146        2.5    0.01753     0.99379
 0.6    0.33322     0.72575        2.6    0.01358     0.99534
 0.7    0.31225     0.75804        2.7    0.01042     0.99653
 0.8    0.28969     0.78814        2.8    0.00792     0.99744
 0.9    0.26609     0.81594        2.9    0.00595     0.99813
 1.0    0.24197     0.84134        3.0    0.00443     0.99865
 1.1    0.21785     0.86433        3.1    0.00327     0.99903
 1.2    0.19419     0.88493        3.2    0.00238     0.99931
 1.3    0.17137     0.90320        3.3    0.00172     0.99952
 1.4    0.14973     0.91924        3.4    0.00123     0.99966
 1.5    0.12952     0.93319        3.5    0.00087     0.99977
 1.6    0.11092     0.94520        3.6    0.00061     0.99984
 1.7    0.09405     0.95543        3.7    0.00042     0.99989
 1.8    0.07895     0.96407        3.8    0.00029     0.99993
 1.9    0.06562     0.97128        3.9    0.00020     0.99995
                                   4.0    0.00013     0.99997
                                   4.1    0.00009     0.99998
                                   4.2    0.00006     0.99999


EXERCISES IN GRADUATION 


The following distributions give (1) the statures of 802 Cairo-born
Egyptians and (2) the statures of 739 Smith College girls (1914-15).
Fit each to the normal curve (a) using the table of ordinates, and (b)
using the table of areas.


  Cms.     (1)       Inches    (2)
 149.5       4         59        1
 153.5      28         60        2
 157.5      73         61        2
 161.5     185         62       11
 165.5     212         63       11
 169.5     167         64       48
 173.5      87         65       45
 177.5      31         66       97
 181.5      14         67      100
 185.5       1         68      126
                       69      103
                       70       97
                       71       45
                       72       46
                       73        4
EXERCISES


It has been found possible to select the most appropriate type of
the Pearson frequency curves for fitting a given distribution from
values of the following functions of moments:

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2},$$

computed for the given distribution.

1. Show by actual integration that for the normal curve $\beta_2 = 3$.
(Suggestion: Integrate the expression for $\mu_4$ by parts with $u = x^3$.)

2. Show by integration that for the normal curve $\beta_1 = 0$.

3. Show that $\beta_2 = 3$ for the normal curve, by integrating the expression
for $\mu_2$ by parts with $dv = x^2\,dx$.

4. Show that for the normal curve and for n, an even number,

$$\mu_n = \frac{\mu_{n+2}}{(n+1)\,\sigma^2}.$$

Integrate $\mu_n$ by parts with $dv = x^n\,dx$.

5. Show that for the normal curve

$$\frac{\mu_6}{\mu_2\mu_4} = 5 \quad \text{and} \quad \frac{\mu_8}{\mu_2\mu_6} = 7.$$

6. Compute the value of $\beta_1$ for (a) the distribution of shots and
(b) the distribution of measurements of Bavarian skulls, given in the
text.

7. Assuming the distributions graduated in the text to be truly
representative, determine the values of the probabilities:

(a) Of the occurrence of a cephalic index of 80 mm.
(b) Of the occurrence of a cephalic index greater than
80 mm.
8. Referring to the tables of ordinates and areas, what are the
probabilities:

(a) Of an occurrence in a normal distribution of a deviation
of $1.46\sigma$? Of a deviation of $3.84\sigma$?

(b) Of an occurrence of a positive deviation greater
than $1.46\sigma$? Of any deviation greater than $3.84\sigma$?

(c) Show that the probability of obtaining any deviation
greater than $x/\sigma$ (i.e., in absolute value) is $1 - \alpha$.

(d) How could values of $1 - \alpha$ be obtained readily from a
table of values of $\tfrac{1}{2}(1+\alpha)$?


59. Least Squares.—If we assume a given distribution of 
residuals or errors to be normal, the probability of the 
occurrence of a given error $(x_r-x)$, where $x_r$ denotes the 
numerical value of the particular observation and x denotes the 
theoretical value according to a given hypothesis, is 

$$p_r=k\,e^{-a(x_r-x)^2},$$

where k and a are constants with which we should now be 
familiar. Then, if the occurrence of each error is independent 
of the occurrence of any other error, and all the errors are to 
be regarded as of equal weight, the probability P of the joint 
occurrence of all the errors is the product of all the corresponding 
values of $p_r$, or 

$$P=K\,e^{-a\{(x_1-x)^2+(x_2-x)^2+\text{etc.}\}},$$

where K denotes the product of all the values of k and is, of 
course, constant. If the errors were not to be regarded as of 
equal weight the value of a would evidently vary accordingly. 
The probability P is evidently greatest when the sum of 
the squares of the errors is the least; therefore, that distribution 
of errors is most likely which makes P have the greatest 
value or the sum of whose squares is a minimum. This principle, 
known as the principle of least squares, is very valuable as a 
basis for the solution of many important problems. As a 
complete explanation of the possible application of the method 
of least squares would require the development of a certain 
amount of technique, and as this development is given in 
almost any one of several treatises devoted entirely to that 
subject, we shall restrict our attention to those applications 
which will prove sufficient for the purposes of this course. 

The general form of application of the method of least 
squares which we shall consider concerns itself with the fitting 
of polynomials of the form $y=a+bx+cx^2+$ etc., to given 
pairs of values. For this purpose it is well to note that the 
errors or residuals mentioned above will be the differences 
between the observed values of y and the corresponding 
theoretical values of y given by the polynomial, and the 
general problem consists in determining the values of the 
coefficients a, b, c, etc., which will make the sum of the squares 
of the differences between the corresponding values of y a 
minimum. The determination of these coefficients constitutes 
a problem in minima where, however, partial differentiation 
is usually necessary. A simple example should make the 
details of the general solution clear. Let us fit the linear 
expression $y=a+bx$ to the three pairs of coordinates (1, 2), 
(3, 9) and (5, 14). The residuals are then $a+b-2$ (the value 
of $a+bx$ for $x=1$, minus the observed value 2), $a+3b-9$ and 
$a+5b-14$, and we are to determine for what values of a and b 
the sum of the squares of these residuals is a minimum. It 
is easily verified that the derivatives of 

$$(a+b-2)^2+(a+3b-9)^2+(a+5b-14)^2,$$

with respect to a and b reduce (on dividing by 2) to $3a+9b-25$ 
and $9a+35b-99$ respectively. Setting these two expressions 
equal to zero and solving simultaneously, we obtain the required 
values of a and b; since these equations are the same as those 
obtained for the same problem by the method of moments, the 
values of a and b must be the same as found previously, or 
$a=-\tfrac23$ and $b=3$, and the final equation must be 
$y=3x-\tfrac23$. 
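
The worked example above can be checked with a short computation. The following is only a minimal sketch in Python (not part of the original text): the function name `fit_line` is an illustrative choice, and exact fractions are used so that the value of a appears as a fraction rather than a rounded decimal.

```python
from fractions import Fraction


def fit_line(points):
    """Least-squares line y = a + b*x for (x, y) pairs, via the normal equations."""
    n = len(points)
    sx = sum(Fraction(x) for x, _ in points)
    sy = sum(Fraction(y) for _, y in points)
    sxx = sum(Fraction(x) ** 2 for x, _ in points)
    sxy = sum(Fraction(x) * Fraction(y) for x, y in points)
    # Normal equations:  n*a + sx*b = sy   and   sx*a + sxx*b = sxy
    det = n * sxx - sx * sx
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b


a, b = fit_line([(1, 2), (3, 9), (5, 14)])
print(a, b)  # the two coefficients of y = a + b*x
```

For the three points of the text the normal equations are 3a+9b=25 and 9a+35b=99, the same pair obtained above by differentiation.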
EXERCISES 


For suggestions for solving the following exercises see similar 
exercises under Moments. 
1. The following results were obtained from an experiment on a 
screw-jack. 

   R     19    29    46    51    66    78    89   101   113
   E      1   1.5     2   2.5     3   3.5     4   4.5     5

Determine a linear relation R=aE+b between the effort E and 
the load R. 

2. In the following data, which are known to follow a law 
represented approximately by R=aE+b, there are errors of observation. 


24.5 27.25 30.5 33.5 36.5 


Hao some l22Oun loo) LS om alee 
6 7 8 $) 10 ll 12 13 


E 4 5 


Determine the most probable values of a and b. 

3. In a tensile test of a mild steel bar the following observations 
were made, where W represents the load in tons and x the elongation 
in inches. (The bar had an initial length of 8 inches and a diameter 
of 0.748 inches.) 


W      1       2       3       4       5       6
x   0.0014  0.0027  0.0040  0.0055  0.0068  0.0082


Determine the relation x=aW+b. 

4. A wire under tension is found by experiment to stretch an 
amount L in thousandths of an inch under a tension T in pounds as 
follows: 

   T    10     15     20     25     30
   L     8   12.5   15.5     20     23


Determine a relation of the form L=kT (Hooke's Law). 

5. A restaurant keeper finds that if he has G guests a day his total 
daily expenditure is E dollars and his total daily receipts are R dollars. 
The following data are averages obtained from the books. 

   G    210    270    320    360
   E   16.7   19.4   21.6   23.4
   R   15.8   21.2   26.4   29.8

Determine the relations R=mG and E=aG+b. What are the 
interpretations of m, a and b? Below what value of G does the business 
cease to be profitable? 

The data of Exercises 4-7, 12-14 are from Kenyon and Lovitt's 
"Mathematics for Collegiate Students of Agriculture and General 
Science." 


6. If a body slides down an inclined plane, the distance S in feet 
that it moves in t seconds after it starts is represented by the equation 
$S=kt^2$. Determine the best value of k consistent with the following 
data: 
IS BAG Opel 23 40.8 65.7 
t 1 2 3 4 5 Ans. k=2.56. 


7. An alloy of tin and lead containing x per cent of lead melts at 
the temperature y (F.) given by the values: 

   x    25    50    75
   y   482   370   356

Determine the relation $y=a+bx+cx^2$. 

8. The weight of ten liters of water was found at different 
temperatures T (C.) and the loss in weight (in g.) W as the temperature 
differed from 4° as follows: 


W Iba 0.3 0 0.3 2 Py. 0f 4.8 7.3 
IP 0 2 4 6 8 10 12 14 
Wiel On 13.8 IC tl 22. 26.8 31.9 37.1 43.3 
i 16 18 20 22 24 26 28 30 


Determine the relation $W=a+bT+cT^2$. 

9. It is claimed that if the brake mechanism is satisfactory and 
road conditions are average, any automobile should stop at distances 
and speeds as follows: 

Speed per hour, v....    10     15    20    25     30    35    40    50
Distance in feet, d....  9.2   20.8   37    58   88.3   104   148   231

Determine the relation $d=av^2+bv+c$. 

10. Observations upon the corresponding temperature T (F.) and 
pressure P of steam in units of 10 were made as follows: (Atmos. 
pressure = 14.7 lbs.) 


   T   240.0   259.2   274.3   286.9   297.8
   P       1       2       3       4       5
   T   307.4   316.0   323.9   331.1   337.8
   P       6       7       8       9      10

Compute the differences and estimate the appropriate degree of 
the polynomial to be fitted. Determine the polynomial. 

11. The following data give the velocity of water (feet per sec.) 
and head in units of 10 feet. 


V 25.4 35.9 43:9 50.7 (56.7 .62.1 67.1 71.3 70.0 sie 
fs al 2 3 4 5 6 7 8 9 10 


Determine the relation $V=aH^2+bH+c$ between velocity V and 
head H. 

12. A strong rubber band stretched under a pull of x kg., shows an 
elongation of y cm., as given by the observations: 


a 1 ihe 2 2.5 a (325 4 4.5 5 
y 1 3 6 Q dee) Lal (228 et eee 
Determine the relation y=k2". Ans. y= 321, 


13. The intercollegiate track records for foot-races are as follows, 
where d is the distance run and t the record time. 


d 100 yds. 220 yds. 440 yds. 880 yds. 1 mile 2 miles 
t  0:094 0:212 0:48 1:544 4:152 9:242 


Determine a relation of the form $t=kd^n$. What should be the 
record time for a race of 1320 yds.? 

14. The corresponding ages in years and diameters in inches of a 
tree with an initial height of 14 feet were found to be as follows: 


Age, y aa, 58 114 140 181 229 
Diameter, z..... 3 i 1332 17.9 24.5 33 


Determine the relation $y=kx^n$. 


15. Vapor pressures, in mm. of mercury, of methyl alcohol at 
various temperatures were found by experiment to be: 


t 0 5 10 15 20 25 30 35 
1 Sh 40 54 71 94 123 159 204 


Determine an appropriate relation. 

16. The temperature of a heated body cooling in the air was taken 
each minute, the results being tabulated as follows: 


t 0 i 2 3 4 5 
T 84.9 79.9 75.0 70.7 67.2 64.3 
t 6 a 8 9 10 

fl (ol) H).9) 57.6 55.6 53.4 


The temperature of the air was 20 degrees. Determine an appropriate 
relation between the temperature T and the time t. 


60. Probable Error in a Single Observation. A Rough 
Method of Computation.—We have shown why the standard 
deviation may be used to measure the consistency of a set of 
observations. A slightly better measure of such consistency 
is given, however, by what is called the probable error in a single 
observation. If a distribution is normal the probable error 
may be defined as the deviation from the mean which, taken 
with both the positive and negative signs, constitutes the limits 
of one-half of the total frequency. If the probable error so 
defined is, say +k, and if the distribution is representative, 
the probability of another deviation selected at random falling 
between —k and +k equals the probability of its falling without 
that interval. A rough method of dealing with a distribution 
is to compute the mean, say a, of the positive deviations, and 
the mean, say 6, of the negative deviations, and to employ 
the arithmetic average of the absolute values of these means 
as the probable error, or the following relation: 

Approximate probable error in a single observation 


$$=\pm\tfrac12\left(|a|+|b|\right) \qquad (47)$$


Such a rough application of the definition of the probable 
error in a single observation can be expected to give only roughly 
approximate results and is introduced here mainly to help 
clarify the conception of probable error; little refinement in 
the calculation should then be employed. As an example, let 
us consider the distribution of litters of mice considered earlier 
in this course. 


Number in     Frequencies,     Deviations,
 Litter            y                x             xy

    1              7               -4            -28
    2             11               -3            -33
    3             16               -2            -32
    4             17               -1            -17
    5             26                0              0     (sum of negatives: -110)
    6             31                1             31
    7             11                2             22
    8              1                3              3
    9              1                4              4

                 121                              60     (sum of positives)


For purposes of approximation it will be well to regard 
one-half of the frequency corresponding to the deviation of 
zero as belonging to the positive deviations and one-half to 
the negative deviations. Then 


$$a=\frac{60}{57}=1.0$$

and

$$b=\frac{-110}{64}=-1.7.$$

Hence, the probable error in a single observation is approximately 
±½(1.0+1.7) or ±1.3. If this distribution were representative, 
the probability of the deviation corresponding to 
another litter selected at random falling between -1.3 and 
1.3 would equal approximately the probability of its falling 
without that interval. It can scarcely be emphasized too 
much that the principal means of obtaining a representative 
distribution is to make a large number of observations. If, 
however, a distribution includes a sufficiently large number of 
cases to be representative, the method of computing the 
probable error given in the next section should be employed. 
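
The rough method can be sketched in a few lines of Python. This is only an illustration, not the author's procedure: the function name `rough_probable_error` and the dictionary layout are assumptions, and the deviations are measured from the litter size 5 as in the table. The text rounds each mean to one decimal before averaging, so carrying full precision gives a slightly larger figure (about 1.4) than the text's.

```python
def rough_probable_error(freq_by_deviation):
    """Average of the absolute mean positive and mean negative deviations.

    Half of the frequency at deviation zero is credited to each side,
    as the text suggests.
    """
    zero = freq_by_deviation.get(0, 0)
    pos_n = sum(f for d, f in freq_by_deviation.items() if d > 0) + zero / 2
    neg_n = sum(f for d, f in freq_by_deviation.items() if d < 0) + zero / 2
    pos_sum = sum(d * f for d, f in freq_by_deviation.items() if d > 0)
    neg_sum = sum(d * f for d, f in freq_by_deviation.items() if d < 0)
    a = pos_sum / pos_n          # mean of the positive deviations
    b = neg_sum / neg_n          # mean of the negative deviations
    return a, b, (abs(a) + abs(b)) / 2


# Litters of mice: deviation from a litter size of 5 -> frequency
litters = {-4: 7, -3: 11, -2: 16, -1: 17, 0: 26, 1: 31, 2: 11, 3: 1, 4: 1}
a, b, pe = rough_probable_error(litters)
print(a, b, pe)
```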
EXERCISES 


Find a rough approximation of the probable error in a single 
observation for the distribution of: 

1. Cephalic indices of Bavarian skulls. 

2. The results of shooting 1000 times at a vertical line. 

3. Enter the table of areas of the normal curve given previously 
and estimate the value of $x/\sigma$ which corresponds to the probable 
error. 


61. Probable Error in a Single Observation. The Standard 
Method of Computation.—Most distributions call for a more 
refined method of computing the probable error than the one 
considered in the preceding section; and even in distributions 
which involve too few observations to be representative, the 
more refined method which we shall now consider will give 
just as good results if the ordinary rules for numerical 
computation are observed. If we enter the table of areas of the 
normal curve backward with $\tfrac12(1+\alpha)=0.75$ we find that 
$x/\sigma=0.6745\ldots$. It follows then that one-half of the total 
frequency of any normal distribution lies between $x=-0.6745\sigma$ 
and $x=+0.6745\sigma$. We write then that the 


probable error in a single observation $=0.6745\sigma$ . . . . (48) 


Thus, in the case of the distribution of litters of mice, where 
$\sigma=1.75$, the probable error in a single observation becomes 
±0.6745(1.75) or about ±1.18. 
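
As a check on the figures just quoted, the following Python sketch recomputes the constant 0.6745 and the probable error for the litters of mice. It is an illustration only: `NormalDist` from the standard `statistics` module stands in for the printed table of areas, and the name `probable_error` is an assumption.

```python
from math import sqrt
from statistics import NormalDist

# Half of a normal distribution lies within +/- z standard deviations of the
# mean, where z is the 75th-percentile point of the standard normal curve.
PE_FACTOR = NormalDist().inv_cdf(0.75)   # about 0.6745


def probable_error(freq_by_value):
    """Return (mean, standard deviation, probable error) of a frequency table."""
    n = sum(freq_by_value.values())
    mean = sum(v * f for v, f in freq_by_value.items()) / n
    variance = sum(f * (v - mean) ** 2 for v, f in freq_by_value.items()) / n
    sigma = sqrt(variance)
    return mean, sigma, PE_FACTOR * sigma


# Litters of mice: number in litter -> frequency
litters = {1: 7, 2: 11, 3: 16, 4: 17, 5: 26, 6: 31, 7: 11, 8: 1, 9: 1}
mean, sigma, pe = probable_error(litters)
print(round(mean, 2), round(sigma, 2), round(pe, 2))
```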

The probable error is one of the most valuable tools of the 
statistician, or even of the scientific investigator in general 
and, as will be indicated in a later article, can be adapted 
and applied to many forms of measurement. It is all very 
well to know, say, the value of the arithmetic average of a large 
number of measurements, but such a value would be worth 
much more if it were attended by the value of the probable 
error, giving information in regard to the consistency of all 
of the observations. Thus, since the average size of the 
litters of mice was found to be 4.59 the result would be of 
greater value if written 4.59(±1.18). 

The value of the standard deviation is frequently used 
instead of the probable error for much the same purpose and 
is usually referred to in that case as the standard error. It 
has been found by experience that the occurrence of a deviation 
of more than "three" times the probable error is either 
very unlikely or due to peculiar influences not covered by 
the investigation. Thus, one would be justified in concluding 
that a size of a litter of mice of 4.59+3(1.18) or of 8 and 
over would be very unlikely if the distribution cited above is 
representative. Deviations are found occasionally, however, 
which seem to be fairly normal and yet which exceed slightly 
three times the probable error; on the other hand, such devia- 
tions practically never exceed three times the mean error 
and hence there are a few authorities who prefer to employ 
the latter criterion in testing a given deviation. We shall 
take the liberty of referring here to "three times the probable 
error" as the maximum probable error and "three times the 
standard error" as the maximum error, for purposes of distinction, 
although where no distinction is necessary we shall 
follow the general custom and use the maximum probable 
error. 


EXERCISES 


Compute the probable error in a single observation for the following 
distributions and note whether any deviations exceed the maximum 
probable error: 


1. Of cephalic indices of Bavarian skulls. 

2. Of the results of shooting 1000 times at a vertical line. 

3. Of the statures of Cairo-born Egyptians (given in a preceding 
exercise). 

4. Of the statures of Smith College girls (given in a preceding 
exercise). 


62. Field Artillery.—While little interest will probably be 
felt in the subject of field artillery itself, it offers opportunities 
for good but simple applications of the idea of probable error 
which should add familiarity and confidence in the use of that 
idea. 

The following abridged table of probable errors was taken 
from a much more complete table based upon the firing of over 
5000 rounds with the 3-inch gun. 


PROBABLE ERRORS (All in yards) 


Range Range Vertical Deflection 
1500 39 ie 3.1 
2000 34 2.4 4.4 
2500 31 3.2 5.6 
3000 29 3.9 7.0 
3500 28 4.9 8.6 
4000 27 5.8 10.4 
4500 26 6.8 i, i 
5000 25 os 14.0 


If we speak of the 50 per cent zone as that determined by the 
probable error, the width of a zone of any per cent can easily 
be obtained from the table of areas of the normal curve, 
given previously. We shall refer to the ratio of the width 
of any zone to the width of the 50 per cent zone as a proba- 
bility factor; a table of these factors can easily be set up, 
and will prove useful in connection with the values of the 
probable errors given in the table above. An abridged table 
follows: 


Per Cent      Probability
                Factor
   10            0.18
   20            0.38
   30            0.57
   40            0.78
   50            1.00
   60            1.25
   70            1.54
   80            1.90
   90            2.44

The values of probable errors for any range not given in 
the upper table and the values of probability factors for any 
per cent not given in the lower table can easily be determined 
by ordinary interpolation. It scarcely needs to be added that 
complete tables would be available and that little interpolation 
would be necessary in actual artillery fire. 
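
The probability factors can also be computed directly instead of read from a table of areas. The sketch below is a modern illustration in Python, not the method of the text: it assumes the normal law and uses `NormalDist` from the standard `statistics` module, and the name `probability_factor` is hypothetical. Its values agree with the abridged table to within a unit in the second decimal place.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z_50 = STD_NORMAL.inv_cdf(0.75)   # half-width of the 50 per cent zone, in sigmas


def probability_factor(per_cent):
    """Width of the central `per_cent` zone divided by the width of the 50 per cent zone."""
    upper_point = 0.5 + per_cent / 200.0   # area below the upper edge of the zone
    return STD_NORMAL.inv_cdf(upper_point) / Z_50


for pc in range(10, 100, 10):
    print(pc, round(probability_factor(pc), 2))
```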

As an example illustrating the use of the tables given above, 
suppose that it is desired to know the location of the center 
of impact (the center of gravity in range of the points of fall 
or impact) with respect to the target, when out of 12 shots 
one shot is observed to be "short" and 11 are "over" at 
range 2200. Since 8⅓ per cent of the shots are short we are 
interested in the width of the 83⅓ (=100-2×8⅓) per cent 
zone. Entering the table of probability factors with 83⅓ 
per cent, we interpolate 2.08 for the probability factor. The 
width of the 83⅓ per cent zone is then 2.08 times the width of 
the 50 per cent zone at 2200 yards (which according to the 
table of probable errors is 2×33 or 66) or 137 yards. The center 
of impact is then probably 68½ (one half of 137) yards beyond 
the target. A battery commander would be justified by such 
a large result (or any result appreciably greater than 25 yards) 
in shortening his range. 
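
The same estimate can be sketched without interpolating in the abridged table. The Python fragment below is an illustration under the assumption of a normal law in range; the function name and arguments are hypothetical, and the probable error of 33 yards at range 2200 is the interpolated value used in the text. Because it avoids the rounding of the table, it gives a result slightly under the 68½ yards found above.

```python
from statistics import NormalDist


def center_of_impact_offset(shorts, overs, probable_error):
    """Estimated distance of the center of impact beyond the target.

    A negative result means the center of impact is short of the target.
    Requires at least one short and one over.
    """
    sigma = probable_error / NormalDist().inv_cdf(0.75)
    fraction_short = shorts / (shorts + overs)
    # The target lies at the point of the range distribution below which
    # `fraction_short` of the shots fall; the center of impact is the mean (0).
    target_position = NormalDist(0.0, sigma).inv_cdf(fraction_short)
    return -target_position


offset = center_of_impact_offset(shorts=1, overs=11, probable_error=33)
print(round(offset, 1))   # yards beyond the target
```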

Another problem of considerable importance is illustrated 
as follows: Assuming that a piece is correctly laid, what is the 
probability of obtaining a hit upon a target 2 yards high and 
4 yards wide at a range of 3000 yards? At 3000 yards the 
width of the vertical 50 per cent zone is 7.8 (=2×3.9) yards, 
and therefore the value of the probability factor is 2÷7.8=0.26 
which (according to the table of probability factors) corresponds 
to 14 per cent; that is, 14 per cent of the shots will take 
effect in the long run between the horizontal lines drawn along 
the upper and lower edges of the target. 

Likewise, the width of the 50 per cent zone in deflection 
is 14.0 yards and the value of the probability factor is 4÷14.0= 
0.29, which corresponds to 16 per cent; that is, 16 per cent of 
the shots will take effect in the long run between two vertical 
lines drawn through the lateral edges of the target. Therefore, 
0.14×0.16=0.0224 is the probability of a shot taking 
effect in the rectangle formed by the two pairs of lines, or upon 
the target. That is, one shot in about 45 will in the long run 
be a hit. 
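
For comparison, the hit probability can be computed from the normal law directly rather than from the abridged table of factors. This Python sketch is illustrative only (the name `zone_probability` is an assumption); it treats the vertical and lateral errors as independent, as the text does. It gives roughly 0.137 and 0.153 for the two zones and about 0.021 for the target, a little below the 0.0224 obtained above from rounded table entries.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z_50 = STD_NORMAL.inv_cdf(0.75)   # probable error expressed in sigmas


def zone_probability(width, probable_error):
    """Chance that a normal deviation falls within a centered zone of the given width."""
    z = (width / 2.0) * Z_50 / probable_error
    return 2.0 * STD_NORMAL.cdf(z) - 1.0


p_vertical = zone_probability(2, 3.9)   # target 2 yards high, vertical P.E. 3.9
p_lateral = zone_probability(4, 7.0)    # target 4 yards wide, deflection P.E. 7.0
p_hit = p_vertical * p_lateral
print(round(p_vertical, 3), round(p_lateral, 3), round(p_hit, 4))
```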


EXERCISES 


1. In shooting 10 times at a tower 3000 yards distant, 9 shots 
were observed to go to the right and one to the left. How much of a 
deviation from the target was indicated? 

2. Show that if A shots are observed to go to one side (i.e., "short," 
to the left, or below, etc.) of a target and B shots to the other side, 
the change to be made in the lay of the piece is the 

corresponding prob. error × prob. factor of $\dfrac{|A-B|}{A+B}$, 

for the given range. 

3. In shooting 15 times at a house 2500 yards distant, 13 shots 
were observed to go “over” and two “under.” How much of a 
deviation from the target vertically was indicated? 

4. In shooting at a certain small object 3000 yards distant, 9 
shots were observed to go to the right and above the target, and 1 to 
the left and below the target. What two changes to be made in the 
lay of the piece were indicated? 

5-9. Assuming that a 3-inch gun is properly laid, find the number 
of shots which would be necessary in the long run: 

(5) To drop a shell in a trench 3 yards wide and at a distance of 
3000 yards, if the trench runs perpendicularly to the line of fire. 

(6) To hit a tower 4 yards wide and 2800 yards distant. 

(7) To hit a house 10 yards high and 12 yards wide at 3000 yards. 

(8) To drop a shell in a gun emplacement or hole 12 yards long 
and 10 yards wide (measured in the direction of the line of fire) at 
2500 yards. 

(9) To sweep a canal, 15 yards wide, which runs quite a distance in 
the line of fire at 4000 yards. 

10. Show that the probability of hitting between two parallel 
lines D yards apart at a given range is given by the per cent 
corresponding to the probability factor "$\tfrac{D}{2}$ ÷ the 
corresponding probable error" for that range. The expression 
"corresponding probable error" is used because the parallel lines 
may be taken in any one of at least three directions. 


63. Probable Errors of Other Quantities or Expressions.— 
We have been careful to refer to the probable error considered 
in the preceding sections as the probable error in a single 
observation. There are formulas also for determining the probable 
error of values of various kinds of expressions, such as the mean, 
the standard deviation, etc., all of which depend fundamentally 
upon the formula for the probable error in a single observation. 
A few of these formulas will now be given but without deriva- 
tions. Other formulas will be introduced later. 

P. E. in the mean 

$$=\pm\frac{0.6745\,\sigma}{\sqrt{n}} \qquad (49)$$

P. E. in the standard deviation 

$$=\pm\frac{0.6745\,\sigma}{\sqrt{2n}} \qquad (50)$$

P. E. in the coefficient of variation v 

$$=\pm\frac{0.6745\,v}{\sqrt{2n}}\left[1+2\left(\frac{v}{100}\right)^2\right]^{1/2} \qquad (51)$$

P. E. in the observed probability p 

$$=\pm 0.6745\sqrt{\frac{pq}{n}} \qquad (52)$$

P. E.¹ in a single observation corresponding to the sum 
or difference of several independent variables 

$$=\pm 0.6745\sqrt{\sigma_1^2+\sigma_2^2+\text{etc.}} \qquad (53)$$

where, in each case, n denotes the number of observations. 

¹See Jones' "First Course in Statistics," p. 158, for a derivation of 
this formula. 

As an example showing the use of formula (49), the probable 
error in the mean (4.59) of the distribution of litters of mice is 

$$\pm\frac{0.6745(1.75)}{\sqrt{121}}=\pm 0.11,$$

which shows that it is an even chance whether the true value of 
the mean lies within the interval from 4.48 (=4.59-0.11) to 
4.70 (=4.59+0.11) or without it. 

As an example showing the use of formula (52) let us consider 
the probability of dying within one year at age 24, or 
0.008, as given by the American Experience table of mortality 
used in connection with life insurance. If the value of this 
probability were based upon the observation of 1000 lives at 
that age the probable error in the value of the probability 
0.008 would, according to formula (52), be 

$$\pm 0.6745\sqrt{\frac{(0.008)(0.992)}{1000}},$$

or about ±0.002, and it would be about an even chance whether 
the true value of the probability would be between 0.006 and 
0.010 or without that interval. If, however, the probability 
were computed from observations covering 100,000 lives at the 
given age it is easily verified that the probable error would be 
only ±0.0002. 
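
The two worked examples can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic only; the function names are illustrative and not from the text.

```python
from math import sqrt

PE = 0.6745   # ratio of the probable error to the standard deviation


def pe_mean(sigma, n):
    """Probable error in the mean, formula (49)."""
    return PE * sigma / sqrt(n)


def pe_std_dev(sigma, n):
    """Probable error in the standard deviation."""
    return PE * sigma / sqrt(2 * n)


def pe_probability(p, n):
    """Probable error in an observed probability p, formula (52), with q = 1 - p."""
    return PE * sqrt(p * (1 - p) / n)


print(round(pe_mean(1.75, 121), 2))               # litters of mice
print(round(pe_probability(0.008, 1000), 3))      # 1000 lives observed
print(round(pe_probability(0.008, 100000), 4))    # 100,000 lives observed
```

Since n enters under a square root, observing a hundred times as many lives reduces the probable error only tenfold.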

The formulas for the probable error in a single observation 
and the probable error in an observed probability represent 
the same important guiding principle which may be expressed 
in the following form which includes a convention mentioned 
previously: If, in a random sample of n variates the proportion 
of successes is p, then the proportion of successes in the universe 
from which the sample is selected will not be likely to fall 


outside the limits ; 
p-:3 (0.6745) 4/74, 
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and if that universe contains altogether N variates the number 
of successes will not be likely to fall outside the limits 


Np+3(0.6745)N 4 [P4. 


In particular, the number of successes in another random 
sample of n variates would not be likely to fall outside the 
limits 

np-+3 (0.6745) V npq. 

We shall establish the latter relation later in a slightly 
different connection. 

Attention is called to a convention which is frequently 
(but not universally) employed in expressing the values of 
probable errors. The probable error ±1.18, expressed 
4.59(±1.18), is to be interpreted as the probable error of a 
single observation and is so written only with the value of 
the mean (such as 4.59) for obvious reasons. A probable error 
without parentheses is to be interpreted as the probable error 
of the expression which it follows. Thus, if the probable error 
of the mean 4.59 is ±0.11 it may be written more compactly 
4.59±0.11. 

The author wishes now to refer again to formula (49) and 
to explain that the observations cited in Art. 37 are in fact 
averages of each of 36 groups of 25 digits (i.e., 0, 1, 2, ...9) 
selected at random. The complete distribution of 900 digits 
is as follows: 


Digit Frequency Digit Frequency 
0 95 5 80 
1 96 6 82 
2 93 7 72 
3 105 8 90 
4 91 9 96 


It is easily verified that for this distribution M=4.38 
and $\sigma=2.91$. Hence, according to formula (49) we should not 
expect any one of the 36 means cited above to vary from 4.38 
by more than $3(0.6745)2.91\div\sqrt{25}$ or 1.18. We should not 
then expect any one of the means to lie outside of the interval 
from 3.20 to 5.56. It is evident that only one lies outside this 
interval. It is easily verified that no one of the means lies 
outside of the interval determined by the maximum error. 
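
The figures for the 900 digits can be verified as follows. The Python sketch below is only a check on the arithmetic; the variable names are illustrative, and the 36 group means themselves (given in Art. 37) are not reproduced here.

```python
from math import sqrt

# Frequency of each digit among the 900 random digits
digit_freq = {0: 95, 1: 96, 2: 93, 3: 105, 4: 91,
              5: 80, 6: 82, 7: 72, 8: 90, 9: 96}

n = sum(digit_freq.values())
mean = sum(d * f for d, f in digit_freq.items()) / n
sigma = sqrt(sum(f * (d - mean) ** 2 for d, f in digit_freq.items()) / n)

group_size = 25
# Three times the probable error of a mean of 25 digits
max_probable_error = 3 * 0.6745 * sigma / sqrt(group_size)
# Three times the standard error of a mean of 25 digits
max_error = 3 * sigma / sqrt(group_size)

print(n, round(mean, 2), round(sigma, 2))
print(round(max_probable_error, 2), round(max_error, 2))
```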


EXERCISES 


1. The distribution of workers in two districts of England in a 
certain year was as follows: 

                   Workers Over      Workers at
                   35 Years Old       All Ages
   District A ....    11,718           35,316
   District B ....     4,029           21,822


Determine whether the difference in the proportion of workers 
over 35 years of age for the two districts is more than one would 
expect in random sampling. Suggestion: Compute p, the ratio of all 
workers over 35 to all workers, and the values of the standard 
deviations of this ratio for each district and then employ formula (53). 

2. Solve the preceding exercise in the same general way but use 
different values of p (i.e., 0.332 and 0.185) in computing the two 
values of the standard deviation. What is the difference between 
the two assumptions used in the two methods of computation? 

3. Investigations have been made to find whether there is any 
significant difference in the size of eggs of cuckoos laid in general and 
laid in nests of certain species of foster parents. The results are as 
follows: 

                                        Mean
   Eggs laid              Number       Length        σ
                                       (mm.)
   generally               1572         22.3        0.96
   in nests of
     Garden Warbler..        91         21.9        0.79
     White Wagtail...       115         22.4        0.76
     Hedge Sparrow...        58         22.6        0.86


Determine whether there is any significant difference in each of 
the three situations. Compute the mean errors of the means and 
employ formula (53). 

4. The following results were obtained in an investigation into 
any significant differences in the males of a certain species of crab 
when found in deep water and when found in shallow water. 

                       Mean Carapace
                       Length (mm.)         σ              v
   Deep water ......    8.59±0.05       1.67±0.04     19.45±0.44
   Shallow water ...    8.41±0.04       1.49±0.03     17.75±0.37


Determine any significant difference. 

   Ans.  Means  0.18±0.07  Possibly. 
         σ      0.18±0.05  Probably. 
         v      1.70±0.58  Possibly. 

5. The standard deviation of the head-lengths of 3000 criminals 
was found to be 6.04593±0.05265 (mm.), and of 1306 of these criminals 
selected at random 6.00247±0.07922 (mm.). Determine whether 
the difference is significant. If the difference were not significant 
would this fact prove the whole group to be homogeneous? If no 
significant differences in the means and all other expressions were 
found, what could be concluded? 

6. The standard deviations of the lengths and breadths of 139 
skulls of the Naqada race, excavated in Upper Egypt and believed to 
be some 8000 years old, were found to be 5.722 and 4.612 mm., respect- 
ively, and the same for 1000 Cambridge undergraduates were 6.161 
and 5.055 respectively. Are the differences significant? 

7. The heights in inches of groups of fathers, mothers, sons and 
daughters were found to be as follows: 


              Fathers       Mothers        Sons        Daughters
   Mean...  67.68±0.06    62.48±0.05    68.65±0.05    63.87±0.05
   σ   ...   2.70±0.04     2.39±0.04     2.71±0.04     2.61±0.03
   v   ...   3.99±0.06     3.83±0.06     3.95±0.06     4.09±0.05


(a) Determine the number of fathers, mothers, sons and 
daughters. 
(b) Check the values of the probable errors. 

8. A random sample of 90 was selected from 514 candidates in a 
certain examination where the grades ranged from 3 to 64. The 
grades by groups and the percentages of the whole group and of the 
sample making these grades were found to be as follows: 

                   Percentages
   Grades       Total      Sample
    3-14           8        8±1.9
   15-24          19       17±2.6
   25-29          16       18±2.7
   30-34          18       13±2.4
   35-39          15       17±2.6
   40-49          19       18±2.7
   50-64           7       10±2.1


Check the probable errors in the last column. Are there any 
evidences of special lack of homogeneity in the whole group? 

9. A sample of 60 towns was selected at random from 241 towns 
in England and Wales, and the number of infectious diseases per 
thousand of population for the sample and for the whole group of 
towns was tabulated to give the following results: 


   Rate per             Frequencies
   Thousand        All Towns     Sample
     1- 4              85         92±10
     5- 8              86         96±10
     9-12              42         28±7
    13 and over        28         24±6


Check the probable errors. Point out any evidences of special 
lack of homogeneity in the whole group. What is the practical 
importance of examples like this and the one given in the preceding 
exercise? 

10. An attempt was made to predict the annual output per head 
(in pounds sterling) in 142 different types of employment for 1907 in 
the United Kingdom, from an analysis of the output of 50 different 
occupations selected at random to give the following results: 


                     Number of Occupations

   Output                     Predicted       Actual
   per Head       Sample      for Total       Number
    ...- 59          4         11±3.6           12
    60- 79          16         45±6.2           42
    80- 99           6         17±4.3           25
   100-119          10         28±5.3           20
   120-189           8         23±4.9           27
    ...              6         17±4.3           16

Check the probable errors. Note any special failures in prediction. 

NOTE.—The method followed in obtaining the random samples 
cited in the above examples is simple but irksome and should perhaps 
be explained. The total number of observations was ranked and 
numbered in each case, and then a sample of these numbers was selected 
by forming numbers out of the digits found, say, in the seventh decimal 
place of successive groups of logarithms. Thus, to obtain a sample 
of 50 out of 142, the seventh-place digits of each successive group of 
three logarithms would be arranged in the order of their occurrence 
and every number so obtained which exceeded 142 was ignored. The 
first 50 numbers retained would serve to identify the sample. 


CHAPTER IX 
THE BINOMIAL $(p+q)^n$. STATISTICAL SERIES 


64. Asymmetrical Curves.—As may be readily inferred, 
there are many curves which are like the normal curve in that 
they have at most one mode, but which differ from the normal 
curve in that they are not symmetrical. Such curves are often 
referred to as skew curves. Most of these types of curves are 
represented by the various graphs obtained by plotting the 
terms of the expansion of the binomial $(p+q)^n$, where p 
represents the probability of the occurrence of a certain event, 
q the probability of its failure and n the number of trials. 
Lest one should think that the graphs of the terms of the 
expansion would be rectangular histograms, it is well to say 
that it is the frequency curves which correspond to these 
histograms to which we refer as representative types of 
frequency curves. When, as a special case, $p=q=\tfrac12$, the 
corresponding graph is symmetrical (that is, skewness is absent 
or zero) and is essentially a form of the normal curve. 

Let us consider again a few concrete illustrations of such 
an expansion and acquire greater familiarity with the signifi- 
cance of the terms of the expansion. If n coins were tossed up, 
the first term of the following expansion 


$$(\tfrac12+\tfrac12)^n = (\tfrac12)^n + n(\tfrac12)^{n-1}(\tfrac12) + \frac{n(n-1)}{2!}(\tfrac12)^{n-2}(\tfrac12)^2 + \text{etc.},$$


is the probability that all coins will be “heads.” Similarly, 
the second term is the probability that all but one will be 
“heads”; since $(\tfrac12)^{n-1}(\tfrac12)$ is the probability that one 
particular coin will be “tails” and the rest “heads,” and this one 
coin can be chosen in n ways. Likewise, the third term is the 
probability that all but two will be “heads,” and so on for the 
rest of the terms. The probability of obtaining exactly 
r “tails” with n coins is then $_nC_r(\tfrac12)^{n-r}(\tfrac12)^r$, where 
$(\tfrac12)^{n-r}(\tfrac12)^r$ is the probability that one particular set of 
r coins of the total n coins will be “tails” and the rest “heads,” 
and $_nC_r$ is the number of ways these r coins can be selected 
from n coins; that is, $_nC_r$ is the number of combinations of 
n things taken r at a time. 
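
As an illustrative sketch of the general term just described, the following computes the probability of exactly r occurrences in n trials. The function name `prob_exactly` is hypothetical, and exact fractions are used so that the terms can be compared with the expansion by eye.

```python
from fractions import Fraction
from math import comb


def prob_exactly(r, n, p):
    """Probability of exactly r occurrences in n trials: nCr * q^(n-r) * p^r."""
    q = 1 - p
    return comb(n, r) * q ** (n - r) * p ** r


half = Fraction(1, 2)
coin_terms = [prob_exactly(r, 4, half) for r in range(5)]
print(coin_terms)        # the five terms of (1/2 + 1/2)^4
print(sum(coin_terms))   # the terms of the expansion add up to unity
```

The same function serves for dice by passing p = 1/6 for the appearance of an ace.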

Likewise, the probabilities of throwing no ace, of throwing 
exactly one ace, exactly two aces, etc., in throwing n dice are 
given by the successive terms of the expansion 


$$(\tfrac56+\tfrac16)^n = (\tfrac56)^n + n(\tfrac56)^{n-1}(\tfrac16) + \text{etc.}$$


It should be emphasized that the terms of these expansions 
represent probabilities and that their sum is unity. If the 
terms of such expansions were multiplied all the way through 
by a suitable number, the various terms would obviously 
represent probable frequencies. Thus, the terms of the 
expansion 


$$64(\tfrac12+\tfrac12)^4 = 4+16+24+16+4,$$


represent the number of times the various possibilities “four 
heads,” “three heads and one tail,” etc., would be most likely 
to occur in the long run in tossing 4 coins 64 times. No one 
would, of course, expect to obtain these frequencies in a single 
experience, although he might come very close to it; the results 
of an actual experience were 5+15+24+15+5. 
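
The multiplication of the terms by a suitable number may be sketched as follows. The function name `theoretical_frequencies` is an assumption made for illustration; the observed figures are those quoted in the text.

```python
from fractions import Fraction
from math import comb


def theoretical_frequencies(n, p, trials):
    """Terms of trials * (q + p)^n, one for each possible number of occurrences."""
    q = 1 - p
    return [trials * comb(n, r) * q ** (n - r) * p ** r for r in range(n + 1)]


expected = theoretical_frequencies(4, Fraction(1, 2), 64)
observed = [5, 15, 24, 15, 5]   # the actual experience quoted in the text
for count, (e, o) in enumerate(zip(expected, observed)):
    print(count, e, o)
```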


EXERCISES 


Compute the theoretical frequencies corresponding to the following 
distributions: 

1. Three dice were thrown 648 times and the number of times a 
“5 or 6” appeared was tabulated as follows: 
Number Frequency 
0 179 
1 298 
2 141 
3 30 


How many times did a “5 or 6” appear? How many times was 
it possible for a “5 or 6” to appear? 

2. Balls were drawn, one at a time, from a bag containing an equal 
number of black and white balls, each ball being returned before the 
next drawing. The number of black balls drawn was then tabulated 
for each consecutive seven drawings as follows: 


Number Frequency Number Frequency 
0 9 4 148 
1 34 5 95 
2 104 6 40 
3 151 7 4 


3-5. Balls were drawn, one at a time, from a bag containing an 
equal number of black and white balls. The number of black balls 
was then tabulated for each consecutive (3) five drawings, (4) six 
drawings and (5) seven drawings, to give the following distributions: 


Number of 
Black Balls     (3)     (4)     (5) 
     0           30      17       9 
     1          125      65      34 
     2          277     166     104 
     3          224     192     151 
     4          136     166     148 
     5           27      69      95 
     6                    8      40 
     7                            4 


Determine how many black balls were drawn and how many it 
was possible to draw in each set. 

6. Three dice were thrown 196 times and the sum of their upper 
faces was tabulated to give the following distribution: 
Sum    Frequency    Ans. 
  4        1           1 
  5        4           3 
  6       11           5 
  7       10           9 
  8       24          14 
  9       22          19 
 10       22          23 
 11       32          24 
 12       17          24 
 13       23          23 
 14        9          etc. 
 15        7 
 16        7 
 17        4 
 18        3 


7. A coin was tossed 2048 times and every time a “head” appeared 
a new record was made to show in which toss “head” appeared for 
the first time. “Head” appeared for the first time in the 

1st toss   1061 times 
2nd  "      494   " 
3rd  "      232   " 
4th  "      137   " 
5th  "       56   " 
6th  "       29   " 
7th  "       25   " 
8th  "        8   " 
9th  "        6   " 


8. Twelve dice were thrown 4096 times and the number of times a 
“6” appeared was tabulated for each throw to give the distribution: 


Number Frequency Number Frequency 
0 447 5 115 
1 1145 6 24 
2 1181 7 7 
3 796 8 1 
4 380 


9. The appearance of a “4, 5 or 6” was tabulated for 4096 throws 
of 12 dice to give the distribution: 


Number Frequency Number Frequency 
0 0 7 847 
1 7 8 536 
2 60 9 257 
3 198 10 71 
4 430 11 11 
5 731 12 0 
6 948 


10. The following distribution gives the number of trumps held 
by the first hand in 25,000 deals at whist: 


Number              Number 
  of     Frequency    of     Frequency 
Trumps              Trumps 
   0        215        5       2950 
   1       1724        6        852 
   2       5262        7        166 
   3       7440        8         20 
   4       6371 


Show that the number of deals which would be necessary, in the 
long run, to yield a hand consisting entirely of trumps would be almost 
seventeen million. 
11. Fourteen coins were tossed 150 times and the number of heads 
and corresponding frequencies was tabulated to give: 


Number Frequency Number Frequency 
3 2 8 25 
4 15 9 15 
5 17 10 6 
6 27 11 6 
7 36 12 0 
13 1 

12. Eight dice were thrown 6561 times, and the number of times a 
“5 or 6” appeared was tabulated to give the distribution: 


Number Frequency Number Frequency 
0 256 5 448 
1 1024 6 112 
2 1792 7 16 
3 1792 8 1 
4 1120 
13. The number of occurrences of any one of 0, 1, 2, 3 or 4 in the 
seventh decimal place was tabulated for each of 300 groups of 50 
logarithms to give the distribution: 


Number Frequency Number Frequency 
14 1 25 42 
15 0 26 36 
16 3 27 30 
17 2 28 28 
18 3 29 15 
19 7 30 16 
20 9 31 5 
21 18 32 2 
22 26 33 2 
23 21 34 1 
24 32 35 1 


14. Eleven dice were thrown 26,306 times and the number of times 
a ‘5 or 6” appeared was tabulated to give the distribution: 


Number Frequency Number Frequency 
0 185 6 3067 
1 1149 7 1331 
2 3265 8 403 
3 5475 9 105 
4 6114 10 14 
5 5194 11 4 


15. Twelve dice were thrown 10,596 times and the number of 
times a ‘4, 5 or 6” appeared was tabulated to give the distribution: 


Number Frequency Number Frequency 

0 1 7 2198 
1 21 8 1380 
2 163 9 648 
3 500 10 188 
4 1141 11 32 
5 1962 12 3 
6 2359 


65. The Probable Error in a Single Observation and the 
Expansion of $(p+q)^n$.—The terms of the expansion $(p+q)^n$ 
form the basis of interesting investigations. If we expand 
the binomial in the order indicated by $(q+p)^n$ and multiply 
the first term $q^n$ by 0, the second term $nq^{n-1}p$ by 1, the third 
term by 2, and so on, the sum of all such products is the first 
moment about $q^n$; and since $(q+p)^n=1$, the moment is a unit 
moment. Hence, 

$$
\begin{aligned}
\nu_1' &= nq^{n-1}p + n(n-1)q^{n-2}p^2 + \frac{n(n-1)(n-2)}{2}q^{n-3}p^3 + \dots + np^n \\
       &= np\{q^{n-1} + (n-1)q^{n-2}p + \dots + (n-1)qp^{n-2} + p^{n-1}\} \\
       &= np(q+p)^{n-1} \\
       &= np.
\end{aligned}
$$

It is left for the student to show that the second unit 
moment about $q^n$ is 

$$\nu_2' = np + n(n-1)p^2.$$

But the second unit moment about the mean or 

$$\nu_2 = \nu_2' - (\nu_1')^2 = np + n(n-1)p^2 - n^2p^2 = np - np^2 = np(1-p) = npq.$$

Therefore, the standard deviation or 

$$\sigma = \sqrt{npq}, \qquad (54)$$

and the probable error in a single observation 

$$= 0.6745\sqrt{npq}. \qquad (55)$$
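
The moments just derived can be checked numerically by carrying out the multiplications term by term. The sketch below does so for twelve dice and the appearance of an ace, a case chosen here only for illustration; the function name `unit_moments` is hypothetical.

```python
from fractions import Fraction
from math import comb, sqrt


def unit_moments(n, p):
    """First and second unit moments about q^n, and the second about the mean."""
    q = 1 - p
    terms = [comb(n, k) * q ** (n - k) * p ** k for k in range(n + 1)]
    nu1 = sum(k * t for k, t in enumerate(terms))            # multipliers 0, 1, 2, ...
    nu2_origin = sum(k * k * t for k, t in enumerate(terms))
    return nu1, nu2_origin, nu2_origin - nu1 ** 2


n, p = 12, Fraction(1, 6)
nu1, nu2_origin, nu2 = unit_moments(n, p)
sigma = sqrt(nu2)                   # formula (54)
probable_error = 0.6745 * sigma     # formula (55)
print(nu1, nu2_origin, nu2, round(sigma, 3), round(probable_error, 3))
```

Exact fractions are used so that the agreement with np, np + n(n-1)p² and npq is exact rather than approximate.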
Let us consider the significance of these results. Since 
$\nu_1'$ ($=np$) is the distance from the term $q^n$ to the mean, 
the np-th term measured from $q^n$ is at the mean and if we 
regard the mean or arithmetic average as the most probable 
value, the np-th term measured from $q^n$ represents the term of 
greatest probability.¹ Now the np-th term measured from 
$q^n$ represents the probability that the event under consideration 
will happen exactly np times in n trials; we conclude, 
then, that np is the most probable number of times that the 
event will occur in n trials. This is, of course, what we should 
expect; for example, if we toss up 400 coins the most probable 
number of “heads” is $np = 400\cdot\tfrac12 = 200$. It is fundamentally 
important to think of np as the most probable number and not 
as the corresponding probability (that is, as an abscissa and 
not as an ordinate). 

¹ As a matter of fact, it can be shown that the value of this term is the 
greatest. See Fisher’s “Mathematical Theory of Probabilities,” p. 100. 
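
The statement that the term at np is the greatest can be checked directly for particular cases by computing every term and picking out the largest. This is a brute-force sketch, not a proof; the function name `most_probable_count` is an assumption made for illustration.

```python
from fractions import Fraction
from math import comb


def most_probable_count(n, p):
    """Index of the largest term of (q + p)^n, found by direct comparison."""
    q = 1 - p
    terms = [comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)]
    return max(range(n + 1), key=lambda k: terms[k])


print(most_probable_count(400, Fraction(1, 2)))   # 400 coins, np = 200
print(most_probable_count(12, Fraction(1, 6)))    # 12 dice, np = 2
```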

The probable error, say k ($=0.6745\sqrt{npq}$), is the divergence 
(an abscissa) on either side of the most probable number 
(the mean) such that the frequency or area between x = -k 
and x = +k is approximately (exactly if the distribution were 
truly normal) half of the total frequency or area; and the 
probability of an observation chosen at random falling within 
the interval is approximately the same as that of its falling 
without the interval. In the illustration given above (of 
tossing 400 coins) the probable error is $0.6745\sqrt{400\cdot\tfrac12\cdot\tfrac12}$ or 
about 7; therefore, a deviation of more than three times 7 
from 200 “heads” is not to be expected if attendant circumstances 
are normal. A deviation of 50 from 200 “heads” 
would practically establish the existence of abnormal conditions, 
such as “influenced tossing,” a “loaded coin,” etc. 
If the value of p used in such an investigation were based solely 
upon experience (such as a death rate) a deviation of more 
than three times the probable error would lead naturally to 
the conclusion that the value of p used in the investigation 
was not representative. As a final illustration, suppose that 
it had been found by experience that the ratio of male children 
born to female children born was 1050/1000; in other words, 
the probability of a child being male is 1050/2050. If 5135 
out of 10,000 children proved to be males, in a certain community, 
what conclusions could be drawn from the deviation 
from the expected number? It is easily verified that the 
expected number is 5122 and that the probable error is 33.7. 
The deviation of 13 is then well within the value of the probable 
error and is to be expected. A divergence of, say, 300 
would, however, show that the given ratio did not fit the given 
situation. 
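
The two illustrations above may be worked as follows. The sketch assumes only formula (55); the function name `probable_error_check` is hypothetical, and the last value returned expresses the deviation in units of the probable error.

```python
from math import sqrt


def probable_error_check(n, p, observed):
    """Expected number np, probable error 0.6745*sqrt(npq), and deviation / probable error."""
    expected = n * p
    probable_error = 0.6745 * sqrt(n * p * (1 - p))
    return expected, probable_error, (observed - expected) / probable_error


# 400 coins with a deviation of 50 from the most probable number of heads.
coins = probable_error_check(400, 0.5, 250)
# 5135 males in 10,000 births, with p = 1050/2050.
births = probable_error_check(10_000, 1050 / 2050, 5135)

for expected, pe, ratio in (coins, births):
    print(round(expected, 1), round(pe, 1), round(ratio, 2))
```

A ratio of more than three in absolute value corresponds to the rule of thumb in the text that a deviation of more than three times the probable error is not to be expected.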


EXERCISES 


1. Determine whether the following actually observed results 
were to be expected: 

(a) 2048 “heads” in 4040 throws of a coin; 

(b) 4096 “heads” in 8132 throws of a coin; 

(c) 10,353 “heads” in 20,480 throws of a coin; 

(d) 2030 black balls in 4096 drawings of a single ball 
from an urn containing black balls and white balls in equal 
proportions; 

(e) same as (d) but 8120 black balls in 16,384 drawings; 

(f) 81,236 trumps in 325,000 deals of a single card; 

(g) 39,756 appearances of a “4, 5 or 6” in 78,000 
throws of a single die; 

(h) 105,602 appearances of a “5 or 6” in 289,366 throws 
of a single die; 

(i) 7513 appearances of a “0, 1, 2, 3 or 4” in a certain 
decimal place in 15,000 logarithms; 

(j) 64,988 appearances of a “4, 5 or 6” in 127,152 throws 
of a single die; 

(k) 670 appearances of a “4 or 6” in 1944 throws of a 
single die; 

(l) 7440 holdings of three trumps in 25,000 deals at 
whist; 

(m) in 32 times out of 196 throws of three dice the sum of 
the upper faces was 11; 

(n) a “5 or 6” appeared twice 141 times in 648 throws 
of three dice; 

(o) a “6” failed to appear 447 times in 4096 throws of 
12 dice. 
2. The number of births by sex in Denmark for the designated 
quinquennial periods was as follows: 


Female Male 
1860-4 130,089 138,289 
1865-9 135,324 142,828 
1870-4 139,733 148,360 
1875-9 154,214 162,823 


Determine whether the deviations of the number of males from the 
number determined by the average ratio of males to females are to be 
expected. 

3. The number of twins was found for each of five genealogical 
records to be as follows: 


Record    Number of Births    Twins 
   1           4184           [illegible] 
   2           4116           32 
   3           4147           [illegible] 
   4           3491           25 
   5           4744            8 


Determine the empirical probability of a birth resulting in twins, 
and whether any number of twins given above is unexpected. 

4. The number of accidents resulting in permanent disability, 
and the number of fatal accidents, for various countries for specified 
periods were as follows: 

                               Permanent 
                               Disability      Fatal 
Austria  (1897-1906).......      82,446        8,349 
Belgium  (1905-1908).......       8,204        1,838 
Denmark  (1899-1906).......       4,192          389 
France   (1899-1908).......     140,877       18,708 
Germany  (1899-1908).......     313,219       59,893 
Italy    (1898-1902).......       9,701        2,224 
Norway   (1895-1905).......       4,496          832 
Russia   (1904-1906).......      34,981        2,345 


Compute the average ratio of the number of fatal accidents to the 
number of accidents resulting in permanent disability, and find 
whether the relative number of fatal accidents in the case of any 
country listed above is unexpected. 
5. Assuming the presence of no abnormal circumstances, determine 
the most probable number of occurrences of the events designated in 
the following examples and the maximum deviation therefrom to be 
expected (i.e., the maximum probable error): 

(a) the number of heads in tossing a coin 10,000 times; 

(b) the number of aces in 10,000 throws of a single die; 

(c) the number of shots taking effect in the first quadrant 
in shooting 10,000 times at a target consisting of the origin of 
a pair of rectangular axes; 

(d) same as (c) except that the axes are oblique and at an 
angle of 60°; 

(e) the number of aces in 1600 drawings of a single card 
from a pack of cards; 

(f) the number of times a marble falls to one side of a 
line used as a target in dropping the marble 5000 times; 

(g) number of times a red ball is drawn from a bag con- 
taining 5 balls, each of a different color, in 10,000 drawings of 
a single ball; 

(h) same as (g) except that the 5 balls consist of 2 reds 
and 3 blacks; 

(i) the number of deaths per annum out of 100,000 
individuals of a given age, where the death rate for that age 
is known to be 0.00864; 

(j) the number of male births out of a total of 100,000 
births, assuming the ratio of male births to female births to 
be 1050 : 1000; 

(k) the number of deaths in a general population of 
1,000,000, where the death rate is known to be 14 per thousand. 

6. If the probability of the occurrence of a certain event in a single 
trial is p, show that the probability that the event will occur np times 
in n trials is 

$$\frac{n!}{(np)!\,(nq)!}\,p^{np}q^{nq}.$$

7. The following formula is known as Stirling’s formula and is 
employed to obtain approximations of factorials of high order: 

$$n! = n^n e^{-n}\sqrt{2\pi n}.$$

Show that the value of the probability given in the preceding 
exercise may be written 

$$\frac{1}{\sqrt{2\pi npq}}.$$
8. It has been found that the average deviation of a distribution 
with a constant probability as a base (called a Bernoullian distribution) 
is 

$$2n\,\frac{n!}{(np)!\,(nq)!}\,p^{np+1}q^{nq+1}.$$

Show that this expression may be written 

$$\sqrt{\frac{2npq}{\pi}}.$$

9. Show that the third unit moment of the expansion of $(q+p)^n$ 
about the mean is $npq(q-p)$. 

10. Show that the fourth unit moment of the expansion of $(q+p)^n$ 
about the mean is $npq\{q^2+(3n-4)pq+p^2\}$. 


66. Bernoullian, Poisson and Lexian Series.—Hereto- 
fore we have given no consideration to the possibilities of 
breaking up a group of observations into several sub-groups 
or sets for individual investigation and comparison. Such a 
consideration will lead to some very important conclusions. 
Let us think of a large number of observations as classified into 
N sets of equal size and consider the three following situations: 

A. Suppose that the probability of the occurrence of a 
certain event remains constant during each and all of the N 
sets. Suppose that the number of times the event occurs 
in the first set is $r_1$; in the second set $r_2$; and so on for all the 
sets. Then the series of absolute frequencies thus obtained 
is called a Bernoullian series. 

B. Suppose that the probability varies from trial to trial 
within each set, but that the values and the variations are 
the same for all sets. The corresponding series of frequencies 
is called a Poisson series. 

C. Suppose that the probability remains constant within 
each set but varies from set to set. The corresponding series of 
frequencies is called a Lexian series. 

The distinctions made above may be illustrated as follows: 
Suppose that each of N bags contains black and white balls 
in the same proportion, and that n balls are drawn, one at a 
time, from each of the bags and the color noted, each ball 
being returned before a new drawing is made. If the number 
of black balls drawn from the first bag is $r_1$, the number from 
the second bag is $r_2$, and so on, the number sequence 

$$r_1, r_2, r_3, \ldots, r_N$$

is a Bernoullian series. 

Suppose, however, that the proportion of black and white 
balls is changed each time a ball is drawn in each set, but that 
the program is exactly the same for each set; the number 
sequence of black balls drawn is then a Poisson series. 

If the proportion of black and white balls is the same 
throughout each set but is altered from set to set, the number 
sequence of black balls drawn is a Lexian series. 

An analysis and comparison of the means and the disper- 
sions of the three types of series just considered lead to 
important results. These results will be developed in the fol- 
lowing sections. 
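
The three drawing programs just described may be imitated with a random number generator in place of the bag of balls. The sketch below is an assumption-laden illustration: the function names are hypothetical, each drawing is treated as an independent trial with a stated probability of black, and the particular proportions (nine tenths down to one tenth) are chosen only as an example.

```python
import random


def count_successes(probabilities, rng):
    """Number of black balls in one set, one trial per listed probability."""
    return sum(1 for p in probabilities if rng.random() < p)


def bernoullian_series(N, n, p, rng):
    """Same probability p for every trial of every set."""
    return [count_successes([p] * n, rng) for _ in range(N)]


def poisson_series(N, program, rng):
    """Probability changes from trial to trial; the program is the same for all sets."""
    return [count_successes(program, rng) for _ in range(N)]


def lexian_series(set_probabilities, n, rng):
    """Probability constant within a set but changing from set to set."""
    return [count_successes([p] * n, rng) for p in set_probabilities]


rng = random.Random(0)
program = [k / 10 for k in range(9, 0, -1)]
print(bernoullian_series(5, 9, 0.5, rng))
print(poisson_series(5, program, rng))
print(lexian_series(program, 9, rng))
```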

67. Dispersion of Ratios.—The observing student will soon 
become familiar with the fact that the results of most sta- 
tistical investigations are better expressed in the form of ratios. 
Final results generally mean more if so expressed. It will 
be recalled that the values of moments mean little unless the 
moments are unit moments, and that the method suggested 
for computing unit moments is equivalent to dividing the 
frequencies of a distribution by the total population and 
working with the ratios. Let us consider, then, the means 
and the dispersions of the three types of series defined above 
where, however, we deal with ratios. 

We shall let $p_i$ denote either the ratio or probability 
corresponding to the i-th observation of each set of a Poisson series, or 
the ratio or probability corresponding to the i-th set of a 
Lexian series. We shall then denote the arithmetic average 
of these ratios or probabilities in the case of either type of series 
by p. 

A. Bernoullian Series.—As the basic probability p remains 
the same, not only from observation to observation but also 
from set to set, the square of the dispersion for each set, and 
therefore for all sets as a whole, must be npq. Denoting the 
dispersion of a Bernoullian series by $\sigma_B$, we have 

$$\sigma_B^2 = npq. \qquad (56)$$


B. Poisson Series.—As the basic probability changes from 
observation to observation but the program is the same for all 
sets, the square of the dispersion is the same for each set and 
hence for all sets. The square of the dispersion for the i-th 
observation of each set is $p_iq_i$ and, according to formula (53), 
the square of the dispersion of the whole set is the sum of all 
such expressions or $\sum p_iq_i$ from i=1 to i=n inclusive. But 

$$p_i = p + (p_i - p), \qquad q_i = q - (p_i - p).$$

Hence 

$$p_iq_i = pq - (p_i - p)(p - q) - (p_i - p)^2,$$

whence, since $\sum (p_i - p) = 0$ by the definition of p, it is easily 
verified that 

$$\sum p_iq_i = npq - \sum (p_i - p)^2.$$

Denoting the dispersion of a Poisson series by $\sigma_P$ we have 
then 

$$\sigma_P^2 = \sigma_B^2 - \sum (p_i - p)^2, \qquad (57)$$

where the summation is to extend from i=1 to i=n inclusive. 
C. Lexian Series.—Since the basic probability for the i-th 
set is $p_i$, the square of the dispersion about the mean $np_i$ of 
that set, for that set, is $np_iq_i$. But we wish to take our origin 
at np, the mean of all the N sets. Hence, reversing the usual 
process (i.e., moving the origin away from instead of to the 
mean), we find that the square of the dispersion of the i-th 
set (about the mean of all the sets) is $np_iq_i + (np_i - np)^2$. A 
reasonable estimate for the square of the dispersion for all N 
sets is then the arithmetic average of the values of the expression 
just found for all the sets. Denoting the dispersion of a 
Lexian series by $\sigma_L$, we find that 

$$\sigma_L^2 = \frac{n}{N}\sum p_iq_i + \frac{n^2}{N}\sum (p_i - p)^2,$$

where the summation is to extend from i=1 to i=N inclusive. 
But it is evident, from a similar summation considered in 
connection with the dispersion of the Poisson series, that 

$$\sum p_iq_i = Npq - \sum (p_i - p)^2.$$

Hence, we have finally 

$$\sigma_L^2 = \sigma_B^2 + \frac{n^2 - n}{N}\sum (p_i - p)^2, \qquad (58)$$

where, to repeat, the summation is to extend from i=1 to 
i=N inclusive. 
Let us consider the means. The mean of a Bernoullian 
series is evidently 

$$M_B = np. \qquad (59)$$

Also, since 

$$p = \frac{1}{n}(p_1 + p_2 + \dots + p_n) \quad\text{or}\quad p = \frac{1}{N}(p_1 + p_2 + \dots + p_N),$$

according as the series is Poisson or Lexian, the mean of a 
Poisson series, which is the same as the mean of each set, is 

$$M_P = \frac{np_1 + np_2 + \dots + np_n}{n} = np,$$

and the mean of a Lexian series, which is the average of the 
means of the various sets, is 

$$M_L = \frac{np_1 + np_2 + \dots + np_N}{N} = np.$$

We find, then, that the mean of a Poisson series is the same 
as the mean of a Bernoullian series and that the mean of a 
Lexian series is the same as the mean of a Bernoullian series 
whenever the probability p of the Bernoullian series is the aver- 
age of the probabilities of the other type of series. 

The important fact to be noted, however, is that the dis- 
persion of a Bernoullian series is less than that of the corre- 
sponding Lexian series and greater than that of the correspond- 
ing Poisson series. This fact is evident from a mere inspection 
and comparison of the formulas for the dispersions of Poisson 
and Lexian series, but the fact is so important that the student 
should verify the statement to his complete satisfaction. 
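
One way for the student to carry out this verification is to build the exact distributions and compare their dispersions with formulas (56), (57) and (58). The sketch below does this for a small case chosen for illustration: nine probabilities 1/10, 2/10, ..., 9/10, used once as the program of a Poisson series (n = 9) and once as the set probabilities of a Lexian series (N = 9 sets of n = 9 trials). The function names are hypothetical, and the Lexian dispersion is taken about the common mean np, as in the text.

```python
from fractions import Fraction


def count_distribution(probabilities):
    """Exact distribution of the number of occurrences in independent trials."""
    dist = [Fraction(1)]
    for p in probabilities:
        nxt = [Fraction(0)] * (len(dist) + 1)
        for k, w in enumerate(dist):
            nxt[k] += w * (1 - p)      # the event fails in this trial
            nxt[k + 1] += w * p        # the event occurs in this trial
        dist = nxt
    return dist


def variance(dist):
    """Second unit moment about the mean (the square of the dispersion)."""
    mean = sum(k * w for k, w in enumerate(dist))
    return sum(k * k * w for k, w in enumerate(dist)) - mean ** 2


probs = [Fraction(k, 10) for k in range(1, 10)]
n = N = len(probs)
p = sum(probs) / len(probs)
spread = sum((pi - p) ** 2 for pi in probs)
sigma2_B = n * p * (1 - p)                                   # formula (56)

sigma2_P = variance(count_distribution(probs))               # Poisson program

per_set = [count_distribution([pi] * n) for pi in probs]     # one binomial per set
pooled = [sum(d[k] for d in per_set) / N for k in range(n + 1)]
sigma2_L = variance(pooled)                                  # Lexian, about np

print(sigma2_P, sigma2_B, sigma2_L)
```

Because exact fractions are used, the agreement with (57) and (58) is an identity for this case and not merely a close approximation.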

68. Numerical Examples of Poisson and Lexian Series.— 
We shall now consider two numerical examples of series which 
should serve to add familiarity with the processes indicated 
by the formulas derived in the preceding section. 

As an example of a Poisson series, let us consider the fol- 
lowing series or distribution, which was obtained by making 
100 sets of 9 drawings of a single ball from a bag containing 
black and white balls. In each set the first drawing was made 
when there were 9 black and 1 white balls in the bag, the 
second drawing when the proportion was 8 to 2, the third 
7 to 3, etc.; the proportion in the last drawing was 1 to 9. 
The number of black balls drawn in each successive 9 drawings 
was then tabulated for all 100 sets, where, of course, the pro- 
gram was the same for each of the sets. 
Poisson Series. Number (m) of black balls and the frequencies 
(y) of these numbers in 100 sample sets of 9 drawings. 


n=9, N=100. 

  m      y      x      xy     x²y 
  1      2     -4      -8      32 
  2      4     -3     -12      36 
  3     13     -2     -26      52 
  4     26     -1     -26      26 
  5     32      0 
  6     17      1      17      17 
  7      5      2      10      20 
  8      1      3       3       9 

       100            -42     192 


The correction to the mean is then -0.420 and the mean 
is at 

$$M = 5.000 - 0.420 = 4.580 \pm 0.089,$$

where the value of the probable error is added to show 
how much of a deviation would naturally be expected as a 
result of chance. 

Likewise $\sigma^2 = 1.920 - (0.420)^2$, and $\sigma = 1.320 \pm 0.062$. 
Referring now to the formulas of the preceding section, we have 

$$p = \tfrac19\left(\tfrac{9}{10} + \tfrac{8}{10} + \dots + \tfrac{1}{10}\right) = \tfrac12.$$

Hence, $q = \tfrac12$ and $np = 9\cdot\tfrac12 = 4.5$. Therefore, 

$$M_P = M_B = 4.500,$$

which evidently checks with 4.580 ± 0.089, found directly 
from the observed data. 

It is easily verified that the value of $\sum (p_i - p)^2$ is 
$(\tfrac{9}{10} - \tfrac{5}{10})^2 + (\tfrac{8}{10} - \tfrac{5}{10})^2 +$ etc., or 0.600. 
Hence, according to formula (57), 

$$\sigma_P^2 = 2.250 - 0.600 = 1.650,$$

and $\sigma_P = 1.285$, which checks satisfactorily with 1.320 ± 0.062 
found directly from the observed data. Also 

$$\sigma_B = \sqrt{9\cdot\tfrac12\cdot\tfrac12} = 1.500,$$

which is obviously greater than the value of $\sigma_P$. 
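
The arithmetic of this example may be retraced as follows. The sketch assumes the frequencies of the table above and the short method with an assumed mean of 5; the function name `mean_and_dispersion` is hypothetical, and the probable errors are not recomputed here.

```python
from math import sqrt


def mean_and_dispersion(values, frequencies, assumed_mean):
    """Short method: deviations x are measured from an assumed mean."""
    N = sum(frequencies)
    sum_xy = sum((m - assumed_mean) * y for m, y in zip(values, frequencies))
    sum_x2y = sum((m - assumed_mean) ** 2 * y for m, y in zip(values, frequencies))
    correction = sum_xy / N                    # correction to the assumed mean
    mean = assumed_mean + correction
    sigma = sqrt(sum_x2y / N - correction ** 2)
    return mean, sigma


m = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 4, 13, 26, 32, 17, 5, 1]
mean, sigma = mean_and_dispersion(m, y, assumed_mean=5)

# Theoretical value from formula (57), with the proportions 9/10, 8/10, ..., 1/10.
probs = [k / 10 for k in range(9, 0, -1)]
n = len(probs)
p = sum(probs) / n
sigma_P = sqrt(n * p * (1 - p) - sum((pi - p) ** 2 for pi in probs))

print(round(mean, 3), round(sigma, 3), round(sigma_P, 3))
```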

We shall comment later at considerable length on the 
extreme rarity, in practice, of pure examples of any of the types 
of series under consideration. In order to prepare the student 
to expect departures from the rigid definitions of the three 
types of series, we shall consider as an example of a Lexian 
series one wherein the basic probability does remain the same 
throughout each set and does vary from set to set in some 
but not in all cases. The general plan of procedure will be 
exactly the same as if the basic probability were different for 
each set. The following distribution was obtained by making 
90 sets of 10 drawings of a single ball from a bag containing 
black and white balls, and tabulating the number of black 
balls corresponding to each successive 10 drawings. The 
proportion of black balls to white balls throughout the first 
10 sets of 10 drawings was 9 to 1; throughout the second 10 
sets, 8 to 2; and so on until the last 10 sets, when the propor- 
tion was 1 to 9. 

Lexian Series. Number (m) of black balls and the frequencies 
(y) of these numbers in 90 sets of 10 drawings of a 
single ball. n=10, N=90. 


  m      y      x      xy     x²y 
  0      6     -5     -30     150 
  1      6     -4     -24      96 
  2      7     -3     -21      63 
  3      9     -2     -18      36 
  4      8     -1      -8       8 
  5     12      0 
  6     10      1      10      10 
  7     14      2      28      56 
  8      5      3      15      45 
  9      9      4      36     144 
 10      4      5      20     100 

        90              8     708 

The correction to the mean is then 8/90 = 0.089 and the mean is 

$$M = 5.000 + 0.089 = 5.089 \pm 0.132.$$

Likewise 

$$\sigma^2 = 708/90 - (0.089)^2,$$

or 

$$\sigma = 2.803 \pm 0.209.$$


Let us see how these results compare with the results to be 
obtained by the formulas of the preceding section. 

$$p = \frac{10\cdot\tfrac{9}{10} + 10\cdot\tfrac{8}{10} + \dots + 10\cdot\tfrac{1}{10}}{90} = \tfrac12.$$

Hence, $q = \tfrac12$ and $np = 10\cdot\tfrac12 = 5$. Therefore, 

$$M_L = M_B = 5.000,$$

which evidently checks with 5.089 ± 0.132 found from the 
observed data. 

Also 

$$\sigma_B = \sqrt{npq} = \tfrac12\sqrt{10} = 1.581.$$

Since 

$$\sum (p_i - p)^2 = 10(\tfrac{9}{10} - \tfrac{5}{10})^2 + 10(\tfrac{8}{10} - \tfrac{5}{10})^2 + \text{etc.} = 6.000$$

and 

$$\frac{n^2 - n}{N} = \frac{100 - 10}{90} = 1,$$

we have, by formula (58), 

$$\sigma_L^2 = 2.500 + 6.000 = 8.500,$$

or 

$$\sigma_L = 2.915,$$

which checks satisfactorily with 2.803 ± 0.209 found from the 
observed data. It is to be noted that $\sigma_L$ is greater than $\sigma_B$. 

The student is urged to set up examples of his own similar 
to those given above. It should be noticed in that connection 
that, if the records of the observations are kept properly, the 
data may be grouped to give either a Lexian or a Poisson series 
and one experiment be made to serve for two. 
EXERCISES 

Compute and compare the observed and the theoretical values of 
the mean and the dispersion of the following series.² Satisfactory 
comparison can be made only if the probable errors of the observed 
values are also computed. 

1. One hundred sets of 100 drawings of a single ball from a bag 
containing an equal number of black and white balls were made, and 
the number of black balls drawn in each set was tabulated to give the 
following distribution: 


Number of 
Black Balls     Frequencies 
    34               1 
    37               0 
    40               5 
    43               8 
    46              15 
    49              25 
    52              19 
    55              16 
    58               8 
    61               2 
    64               1 

$M = 50.1 \pm 0.54$     $\sigma = 5.33 \pm 0.38$ 
$M_B = 50.0$            $\sigma_B = 5.00$ 


2. One thousand sets of 10 drawings of a single card from a pack 
of cards were made, and the number of black cards drawn in each set 
was tabulated to give the following distribution: 


Number of 
Black Cards     Frequencies 
[the frequencies for 0 to 9 black cards are illegible in this copy; one surviving entry reads 34] 
    10               0 

$M = 4.93 \pm 0.05$     $\sigma = 1.55 \pm 0.04$ 
$M_B = 5.00$            $\sigma_B = 1.58$ 


² Most of these exercises were taken from Arne Fisher’s “Mathematical 
Theory of Probabilities.” His answers are given also. 
3. One hundred sets of 10 drawings of a single card were made 
from a pack of cards and the number of black cards drawn in each set 
was tabulated to give the following distribution. During the first 
ten sets the proportion of black cards to red cards was 26 to 26; during 
the second ten sets, 25 to 27, etc. 


Number of 
Black Cards     Frequencies 
[the column of numbers is illegible in this copy; the surviving frequencies, in order, are 4, 9, 19, 21, 23, 10, 1, 2] 

$M = 4.38 \pm 0.17$     $\sigma = 1.67 \pm 0.12$ 
$M_L = 4.14$            $\sigma_L = 1.64$ 
$M_B = 4.14$            $\sigma_B =$ [illegible] 


4. One hundred sets of 27 drawings of a single card were made 
from a pack of cards, and the number of black cards drawn in each 
set was tabulated to give the following distribution. In each set a 
black card was replaced by a red card from another pack after each 
drawing, so that in each set the proportions of black cards to red cards 
were 26 to 26, 25 to 27, ... 0 to 52 in the respective order of the 
drawings. 


Number of 
Black Cards     Frequencies 
     3               2 
     4               6 
     5              14 
     6              14 
     7              22 
     8              17 
     9              14 
    10               8 
    11               1 
    12               1 
    13               1 

$M = 7.16 \pm 0.21$     $\sigma = 1.94 \pm 0.15$ 
$M_P = 6.75$            $\sigma_P = 2.11$ 
$M_B = 6.75$            $\sigma_B =$ [illegible] 


5. Twenty sets of 500 single drawings of a card from a pack of 
cards were made and the number of diamonds drawn in each set was 
tabulated to give the following results: 

109, 116, 117, 121, 122, 
123, 124, 124, 129, 130, 
130, 132, 133, 135, 135, 
136, 138, 139, 142, 143. 

$M = 128.9 \pm 2.01$     $\sigma = 8.96 \pm 1.42$ 
$M_B = 125.0$            $\sigma_B = 9.68$ 


6. Twenty sets of 500 drawings of a single ball from an urn were 
made, and the number of black balls drawn in each set was tabulated. 
The proportion of black balls to white balls varied from set to set, 
20 to 20, 19 to 21, ... 1 to 39. The following distribution was 
obtained. 

251, 246, 222, 216, 193, 
176, 183, 173, 156, 135, 
140, 127, 115,  96,  78, 
 69,  55,  43,  29,  18. 

$M = 136.6 \pm 15.9$     $\sigma = 70.1 \pm 11.1$ 
$M_L = 131.3$            $\sigma_L =$ [illegible] 
$M_B = 131.3$            $\sigma_B = 9.8$ 


7. One hundred sets of 100 drawings were made from a pack of 
cards, and the number of aces drawn in each set was tabulated to give 
the following distribution: 


Number of 
Aces            Frequencies 
[the table is illegible in this copy] 

$M = 7.45 \pm 0.28$     $\sigma = 2.79 \pm 0.20$ 
$M_B = 7.69$            $\sigma_B = 2.66$ 


8. Five hundred sets of 20 drawings of a single ball from an urn 
were made, and the number of black balls drawn in each set was 
tabulated to give the following distribution. In each set the 
proportion of black balls to white balls was 20 to 20, 19 to 21, etc., in the 
respective order of the drawings. Compute the necessary probable 
errors. 


Number of 
Black Balls     Frequencies 
[the table is illegible in this copy; the surviving frequencies read 69, 16 and 1] 

$M = 5.14$        $\sigma = 1.93$ 
$M_P = 5.25$      $\sigma_P = 1.86$ 
$M_B = 5.25$      $\sigma_B = 1.97$ 


9. Ten cards were drawn 100 times from a pack of 52 cards, and 
the number of “black” cards tabulated for each drawing; and this 
procedure was repeated 26 times. After each set of 100 drawings a 
black card was replaced by a red card from another pack, so that the 
ratios of “blacks” to “reds” in the successive sets were as follows: 
26 : 26, 25 : 27, 24 : 28, etc. The following total frequencies were 
obtained: 


Number of 
Black Cards     Frequencies 
[the table is illegible in this copy; the surviving frequencies read 406, 464, 454, 63 and 5] 

$M = 2.625 \pm 0.038$     $\sigma = 1.938 \pm 0.018$ 
$M_B = 2.596$             $\sigma_B = 1.386$ 
69. Practical Applications.—We are now ready for some 
practical applications of our knowledge of the relative values 
of the dispersions of the different types of series just considered. 
In the vast majority of problems occurring in 
practice, the values of the basic probabilities will be unknown 
and must be determined empirically. Moreover, observations 
will appear in practice in one large group and not in well-defined 
sets or sub-groups. What we shall do is to determine 
the basic probability and the dispersion, on the assumption 
that the given group of observations constitutes a Bernoullian 
series. We shall refer to the value of the dispersion 
so found as the value of the hypothetical dispersion, and denote 
it by $\sigma_B$. The value of the hypothetical dispersion can then 
be compared with the value of the dispersion computed directly 
from the observations. As an example, let us refer to the 
numerical example of a Lexian series considered in the preceding 
section, but let us suppose that we know nothing about 
the proportion of black and white balls in any drawing. To 
compute the value of the hypothetical dispersion we solve 
the equation np = M for p, where M, the mean, is found by 
computation to be 5.089 and n = 10, and obtain p = 0.5089. 
Hence, the value of the hypothetical dispersion is 
$\sigma_B = \sqrt{npq} = \sqrt{10(0.5089)(0.4911)} = 1.581$. The value of the dispersion 
computed directly from the observations has already been 
found to be 2.803 ± 0.209, and the discrepancy between the 
two values of the dispersion indicates clearly the character of 
the given distribution or series. The numerical example 
of a Poisson series considered in the preceding section can 
be analyzed in the same way. It is easily verified that 
$p = 4.580 \div 9 = 0.5089$ and that $\sigma_B = \sqrt{9(0.5089)(0.4911)} = 1.500$. 
The value of the dispersion computed directly from the 
observations, 1.320 ± 0.062, compared with $\sigma_B = 1.500$ indicates 
clearly the character of that series. 

We shall refer frequently to the dispersions of Bernoullian,
Poisson and Lexian series (and sometimes to the series themselves)
as normal, subnormal and hypernormal respectively.
The dispersion in the first example considered above is then
hypernormal and that of the second is subnormal.
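
The computation of the hypothetical dispersion described above can be sketched in a few lines of Python. This is only an illustration under the assumption made in the text (a Bernoullian series whose probability is estimated from $np=M$); the function name `hypothetical_dispersion` is ours and does not come from the text.

```python
import math


def hypothetical_dispersion(mean, n):
    """Bernoullian dispersion sqrt(n*p*q), with p estimated from np = M."""
    p = mean / n          # empirical basic probability
    q = 1.0 - p
    return math.sqrt(n * p * q)


# Lexian example of the preceding section: M = 5.089, n = 10
sigma_b_lexian = hypothetical_dispersion(5.089, 10)

# Poisson example of the preceding section: M = 4.580, n = 9
sigma_b_poisson = hypothetical_dispersion(4.580, 9)

print(round(sigma_b_lexian, 3), round(sigma_b_poisson, 3))
```

Each value is then to be set beside the dispersion computed directly from the observations, as is done in the text.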

It should be obvious that practically all of the problems 
which will arise in practice will differ from similar problems 
connected with games of chance, in that more factors will enter 
in and the basic conditions can not be controlled so easily. 
Statistical series will rarely conform, then, to either of the 
precise definitions of series given previously; the basic proba- 
bility will probably be changing from observation to obser- 
vation, as well as from set to set, throughout the investigation, 
so that the fundamental problem to be considered in practice 
is that of determining whether the ultimate effect of these changes 
is greater from set to set or from observation to observation 
within each set. 

Let us ignore for the present any inequalities between what 
would correspond to the sizes of the various sets, and con- 
sider the number of passengers killed (m) on railroads in the 
United States in the decade from 1911 to 1920 inclusive. 


Year      m     |m−M|    (m−M)²

1911     299      11        121
1912     283       5         25
1913     350      38      1,444
1914     232      56      3,136
1915     199      89      7,921
1916     242      46      2,116
1917     301      13        169
1918     471     183     33,489
1919     273      15        225
1920     229      59      3,481

        2879             52,127


The total number killed (i.e., passengers, employees, etc.) 
varies a little from year to year in the neighborhood of 10,000. 
Assuming n= 10,000 we obtain as the probability that a person 
killed in any year will be a passenger 


$$p=\frac{287.9}{10{,}000}=0.0288 \quad\text{and}\quad \sigma_B=\sqrt{288(0.9712)}=16.7.$$

The dispersion computed directly from the observations is

$$\sigma=\sqrt{5212.7}=72.2\pm 10.8.$$


On comparison between the two values of the dispersion, 
we conclude that the relative number of passengers killed from 
year to year is very unstable. 
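
The comparison can be checked by machine. The sketch below is ours, not the author's; it takes the yearly counts as tabulated, with 273 for 1919 (the value implied by the printed total of 2879), and assumes $n=10{,}000$ as in the text. Recomputed this way, the deviation for 1913 is $350-288=62$ rather than the tabulated 38, so the direct dispersion comes out near 73.8 instead of the printed 72.2. The conclusion is unaffected.

```python
import math

# Passengers killed on U.S. railroads, 1911-1920, as tabulated above
killed = [299, 283, 350, 232, 199, 242, 301, 471, 273, 229]
n = 10_000  # assumed total number of persons killed per year

mean = sum(killed) / len(killed)

# Dispersion computed directly from the observations
sigma = math.sqrt(sum((m - mean) ** 2 for m in killed) / len(killed))

# Hypothetical (Bernoullian) dispersion from p = M/n
p = mean / n
sigma_b = math.sqrt(n * p * (1 - p))

print(round(mean, 1), round(sigma, 1), round(sigma_b, 1))
```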

The values of the dispersion found above should not be 
regarded as reliable, because the inequality of the sizes of the 
sets was ignored. This fault will be considered in the next 
section. 


EXERCISES 


Ignore the possible effects of different sizes of sets of observations 
and investigate the normality of the following series, computing the 
probable error of the dispersion in each case: 

1. The number of business concerns, out of 122, which reported 
deficits to the Federal Reserve Bank of New York were as follows: 


1919      5
1920      9          $\sigma=11.14$
1921     34
1922     18          $\sigma_B=3.78$


2. The number of deaths from accidents in coal mines in France, 
omitting the results of one very disastrous catastrophe, were as follows: 


1901 218 1906 163 
1902 196 1907 198 
1903 184 1908 171 
1904 193 1909 210 
1905 187 1910 194 


Assume the total number of miners to be 180,000. 

3. The following data represent the number of children born in
Sweden, adjusted to a stationary population of 5,000,000 in accordance
with a method to be explained in the next section:

1881     145,230          1891     141,070
1882     146,630          1892     134,830
1883     144,320          1893     136,540
1884     149,360          1894     134,840
1885     146,600          1895     136,820
1886     148,270          1896     135,330
1887     148,020          1897     132,750
1888     143,680          1898     134,820
1889     138,300          1899     131,320
1890     139,600          1900     134,460

$M=140{,}140$     $\sigma=5{,}718$     $\sigma_B=369.0$

4. The following data give the number of marriages in Denmark,
adjusted to a population of 2,500,000:

1888     17,605          1897     18,676
1889     17,622          1898     18,870
1890     17,181          1899     18,661
1891     17,017          1900     19,015
1892     17,012          1901     17,870
1893     17,676          1902     17,712
1894     17,445          1903     17,791
1895     17,736          1904     17,895
1896     18,239          1905     17,947

$M=18035$     $\sigma=588.46$     $\sigma_B=133.8$

The data of Exercises 2-5 are from Fisher’s “Mathematical Theory of Probabilities.”


5. The number of still-births in Denmark (freed from certain
secular fluctuations in accordance with a method discussed later)
were as follows:

1888     1754          1901     1741
1889     1826          1902     1712
1890     1741          1903     1712
1891     1699          1904     1718
1892     1740          1905     1730
1893     1726          1906     1675
1894     1666          1907     1875
1895     1708          1908     1765
1896     1678          1909     1745
1897     1784          1910     1747
1898     1779          1911     1756
1899     1728          1912     1745
1900     1696

The average annual number of births is 70,000.

$M=1735$     $\sigma=37.09$     $\sigma_B=41.06$

6. Rough estimates of the number of deaths in New York City,
adjusted to a stationary population of 2,000,000, were made as follows:


1805 52,600 1865 63,400 
1815 49,400 1875 55,200 
1825 51,600 1885 53,600 
1835 59,600 1895 46,200
1845 60,600 1905 38,000 
1855 71,200 1915 30,600 


70. Basic Factors and Adjusted Series—One of the 
requirements of the theory, as considered previously, which 
can rarely be met absolutely in practice, but which is easily 
met in problems connected with games of chance, is that the 
number of observations in each of the sets shall be equal. We 
ignored this requirement in the example considered in the 
preceding section (and in the examples given in the exercises) 
to simplify the explanation of the application, and also because 
the character of the series considered was so strongly indicated 
that the discrepancies in the fulfilment of this requirement 
would clearly have no significant effect upon the final con- 
clusions. Let us consider the proper mode of procedure 
when the fulfilment of the requirement is of greater relative 
importance. 

If we find that a certain event happens $m_i$ times in $n_i$ trials,
we conclude that the best estimate we can make of the probability
of the occurrence of the event from those observations
is $p_i=m_i/n_i$. If a number of sets of trials are made and the
number of trials varies from set to set, the various values of
$m_i$ may not be at all comparable. If, however, the values of
the bases $n_i$ vary only a little from each other, it may prove
satisfactory to establish a common base, say $n$. Thus, if we
multiply both the numerator and denominator of $p_i$ by $n/n_i$
we obtain

$$p_i=\frac{m_i}{n_i}=\frac{\dfrac{n}{n_i}\,m_i}{n}.$$

We shall call the factor $n/n_i$ the basic factor; it is to be employed
to adjust the various frequencies $m_1$, $m_2$, etc., to a
common base $n$ when that factor differs very little from unity.
The new series so obtained will be known as the adjusted series.
As an example, the following data give the number of marriages 
in thousands in the United States, the corresponding basic 
factors for a common base n=730 (thousand), and the number 
of divorces (in thousands) for the specified years. 


            Marriages        Basic        Divorces
Year       (thousands),     Factors,     (thousands),
               n_i           n/n_i           m_i

1896           614            1.19            43
1897           622            1.17            45
1898           627            1.16            48
1899           651            1.12            51
1900           685            1.07            56
1901       [illegible]        1.02            61
1902           747            0.98            61
1903           786            0.93            65
1904           781            0.94            66
1905           805            0.91            68


The second column of the following table gives the adjusted
series $m'_i$, or the number of divorces multiplied by the
corresponding basic factor.


Year      m'     (m'−57)    (m'−57)²
1896      51       −6          36
1897      53       −4          16
1898      56       −1           1
1899      57        0           0
1900      60        3           9
1901      62        5          25
1902      55       −2           4
1903      59        2           4
1904      60        3           9
1905      61        4          16

Therefore, the mean is at

$$M=57.4,$$

$$\sigma^2=12-0.16=11.84,$$

and

$$\sigma=3.44\pm 0.52.$$

The empirical probability of a divorce from a marriage is

$$p=\frac{M}{n}=\frac{57.4}{730}=0.079,$$

and

$$q=0.921.$$

Hence, the Bernoullian dispersion is

$$\sigma_B=\sqrt{npq}=\sqrt{57.4(0.921)}=7.27.$$


The given series is then clearly subnormal, and we conclude 
that no significant disturbing influences have affected the data 
from year to year. 
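
The adjustment and the test of normality just carried out may be sketched as follows. This is a minimal illustration, not the author's procedure in code form: the helper `adjust` is a name of our own choosing, and the adjusted series is taken as tabulated in the text rather than recomputed from the rounded basic factors.

```python
import math


def adjust(m_i, n_i, n):
    """Reduce a frequency m_i observed on the base n_i to the common base n."""
    return m_i * n / n_i


# First row of the table: 43 thousand divorces on 614 thousand marriages
first_adjusted = adjust(43, 614, 730)

# Adjusted series m' as tabulated in the text (thousands), common base n = 730
adjusted = [51, 53, 56, 57, 60, 62, 55, 59, 60, 61]
n = 730

mean = sum(adjusted) / len(adjusted)
sigma = math.sqrt(sum((m - mean) ** 2 for m in adjusted) / len(adjusted))

p = mean / n                              # empirical probability of a divorce
sigma_b = math.sqrt(n * p * (1 - p))      # Bernoullian dispersion

is_subnormal = sigma < sigma_b
print(round(mean, 1), round(sigma, 2), round(sigma_b, 2), is_subnormal)
```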


EXERCISES 


1. The number of twins was found, for each of five genealogical 
records, to be as follows: 


        Number of
         Births      Twins
1.        4184         11
2.        4116         32
3.        4147         11
4.        3421         25
5.        4744          8

Adjust the series and test for normality.     Ans. $\sigma=11\pm 2.3$
                                                   $\sigma_B=4$


2. The total number of persons killed on railroads in the United 
States was as follows: 


1911 10,396 1916 10,001 
1912 10,585 1917 10,087 
1913 10,964 1918 9,286 
1914 10,302 1919 6,978 


1915 8,621 1920 6,958 




Adjust the number of passengers killed, given in the text of the 
preceding section, to a base of 10,000, and test the normality of the 
series so obtained. 

3 and 4. The following distributions give (3) the population and 
number of deaths due to cancer among females in the registration 
states of 1900, and (4) the total number of deaths occurring in New 
York City, and the number of deaths in hospitals in New York City. 
Adjust the two series and test the adjusted series for normality. 


                  (3)                            (4)

          Total         Cancer          Total      Deaths in
        Population      Deaths          Deaths     Hospitals
1906    11,169,000      10,290          76,203      19,163
1907    11,365,000      10,870          79,205      21,444
1908    11,562,000      11,290          73,072      20,684
1909    11,758,000    [illegible]       74,105      21,451
1910    11,954,000      12,398          76,742      22,631
1911    12,151,000      12,634          75,423      23,466
1912    12,347,000      13,350          73,008      22,198
1913    12,544,000      13,880          73,902      22,788
1914    12,740,000      14,052          74,803      23,823
1915    12,936,000      14,472          76,193      25,095


71. Statistical Series Whose Basic Factors Would Vary 
too Widely from Unity.—The use of basic factors to adjust a 
series of frequencies for purposes of investigation, of the kind 
considered in the preceding section, is justified only when such 
factors differ little from unity. It scarcely needs to be said that 
statistics collected from various sources are only too likely 
to be based upon individual investigations which will differ 
quite widely from each other in the number of cases investi- 
gated. Suppose, for example, that we were investigating the 
trend of mortality due to a certain disease throughout many 
countries from year to year, and we wished particularly to 
determine whether the mortality statistics from the various 
countries were comparable or not; it is obvious that the basic 
factors of two countries, one of which has a population, say, 
one hundred times that of the other, would differ too widely, 
and that the use of such factors would be equivalent to weighting
the statistics of the smaller country one hundred times as
much as the statistics of the larger country. As a matter of
fact, the mean error of the adjusted statistics of the smaller
country would probably be ten times that of the adjusted
statistics of the larger country. (Explain.) On the other
hand, if we should weight the terms of such a statistical series
according to the relative values of their precision (i.e., proportional
to the square root of their bases), we should thereby
assign a character of stability to the statistics with the larger
bases which would probably be unjustified.
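
One way to see the factor of ten mentioned above is sketched here. The assumption, made only for illustration, is that the frequency $m_i$ of each country is Bernoullian with a common probability $p$, so that its dispersion is $\sqrt{n_ipq}$; the subscripts "smaller" and "larger" are our own labels.

```latex
\sigma^2\!\left(\frac{n}{n_i}\,m_i\right)
  =\left(\frac{n}{n_i}\right)^{2} n_i\,pq
  =\frac{n^{2}pq}{n_i},
\qquad\text{so that}\qquad
\frac{\sigma_{\text{smaller}}}{\sigma_{\text{larger}}}
  =\sqrt{\frac{n_{\text{larger}}}{n_{\text{smaller}}}}
  =\sqrt{100}=10 .
```

The adjusted frequency of the smaller country is thus far less precise, although the basic factor treats it as though it rested upon the same number of cases.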

The method which we shall employ will be simply stated
without any demonstrations. It is based upon the two
relations

$$\sigma_B=\sqrt{\frac{Nn}{\sum n_i}\,npq} \tag{59}$$

and

$$\sigma=\sqrt{\frac{\pi}{2}}\;\text{A.D.}\quad(=1.2533\ \text{A.D.}),$$


of which the first was derived for the specific purpose of 
analyzing the types of distributions under consideration, the 
second is a relation between the dispersion and average devia- 
tion given in a previous exercise (Exercise 8, p. 178), and both 
relations are supposed to hold only when the given distribution 
is Bernoullian. Since, however, the average deviation gives no 
special emphasis to extreme deviations, the variations in its 
value in distributions which are not Bernoullian should not 
only be in the direction in which we should expect the dispersion 
to vary, but should reflect to a minimum extent the incon- 
sistencies introduced in comparing frequencies with widely 
different bases. We shall therefore employ the second relation 
to estimate the dispersion which would be computed directly 
from the statistics if the number of observations in each set
were the same. It is well to note in that connection that the
average deviation of such a series would be

$$\text{A.D.}=\frac{\sum n_i\left|\dfrac{n}{n_i}\,m_i-np\right|}{\sum n_i}=\frac{n\sum\left|m_i-n_ip\right|}{\sum n_i},$$

and that for either relation

$$p=\frac{\sum m_i}{\sum n_i}.$$

Relation (59) is then to be employed to determine what
would correspond to the hypothetical dispersion.

5 Fisher, p. 159.

It should be noted that the application of the two relations 
given above does not require that the terms of a distribution 
or series be adjusted. As an example, the number of auto- 
mobiles and the number of automobile fatalities in ten states 
in 1920 were as follows: 


                 Number of
                Automobiles     Fatalities
                    n_i            m_i         n_i p     |m_i − n_i p|

California        681,000          734          593          141
Illinois          663,000          728          577          151
Indiana           400,000          248          348          100
Kentucky          127,000           84          110           25
Michigan          476,000          419          415            4
Minnesota         325,000          178          283          105
Missouri          346,000          231          301           70
Nebraska          239,000          104          208          104
Ohio              723,000          718          630           87
Tennessee         117,000          130          102           28

           Σn_i = 4,097,000    Σm_i = 3574                   815

Hence,

$$p=\frac{3574}{4{,}097{,}000}=0.0009,$$

$$\text{A.D.}=\frac{1000\times 815}{4{,}097{,}000}=0.199,$$

and

$$\sigma=1.2533\times\text{A.D.}=0.249\pm 0.037.$$

Also

$$\sigma_B^2=\frac{10\cdot 1000}{4{,}097{,}000}\;1000(0.0009)(0.9991),$$

or

$$\sigma_B=0.047.$$


We therefore conclude that the ratio of fatalities to the 
number of automobiles varies considerably from state to state. 
As a matter of fact, the divergence between the values of the 
dispersions would probably have been much greater if we had 
included the data for states such as New York, New Jersey, 
etc., where a large part of the automobile traffic is urban. 
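
The two relations may be applied to the automobile data without adjusting any term, as the following sketch shows. It is an illustration only: the variable names are ours, the Ohio fatalities are taken as 718 (the value implied by the printed total of 3574), and the common base $n=1000$ is the one used in the text. Because $p$ is carried here without rounding, the results differ slightly in the last figure from those printed above.

```python
import math

# (automobiles n_i, fatalities m_i) for the ten states tabulated above
data = [
    (681_000, 734), (663_000, 728), (400_000, 248), (127_000, 84),
    (476_000, 419), (325_000, 178), (346_000, 231), (239_000, 104),
    (723_000, 718), (117_000, 130),
]
n = 1000                 # common base used in the text
N = len(data)            # number of sets
total_n = sum(n_i for n_i, _ in data)
total_m = sum(m_i for _, m_i in data)

p = total_m / total_n

# Average deviation referred to the common base n
avg_dev = n * sum(abs(m_i - n_i * p) for n_i, m_i in data) / total_n

# Second relation: dispersion estimated from the average deviation
sigma = math.sqrt(math.pi / 2) * avg_dev

# Relation (59): hypothetical (Bernoullian) dispersion
sigma_b = math.sqrt(N * n / total_n * n * p * (1 - p))

print(round(sigma, 3), round(sigma_b, 3))
```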


EXERCISES 


Test the normality of the following series: 

1. The following data give the number of wives and the number of 
childless wives for different periods, according to statistics collected 
from 22 genealogical records of American families: 


Total Childless 
1750-99 1966 37 
1800-49 5530 225 
1850-69 3062 181 
1870-79 1086 88 


2. The following data give the total number of accidents resulting 
in permanent disability and the total number of fatal accidents, for 
various countries. 


                                   Permanent
                                   Disability        Fatal
Austria (1897-1906)                  82,446          8,349
Belgium (1906-1908)                   8,204          1,838
Denmark (1899-1906)                   4,192            389
[illegible]                         140,877         18,708
Germany (1899-1908)                 313,219         59,893
Italy [years illegible]               9,701          2,224
Norway (1895-1905)                    4,496            832
Russia (1904-1906)                   34,981          2,345

                                    598,116         84,578
72. The Lexian Ratio and the Charlier Coefficient.—It
has been found from experience, and it would be only natural 
to suspect that a great majority of statistical series are hyper- 
normal or Lexian, or that the ultimate effect of the change 
of the basic probability during an investigation is less than the 
ultimate effect of the change of the probability from investi- 
gation to investigation. This fact constitutes the main 
reason why we should never be too hasty in placing absolute 
faith in the value of an important probability which has been 
determined empirically. Any plan, then, for testing the 
reliability and comparability of a set of ratios or probabilities, 
like the one which we have been considering in this chapter, 
deserves considerable attention and possible extension.
Results obtained by the plan considered here may be expressed 
better and more concisely by two formulas. The simpler 
one is the ratio 


$$L=\frac{\sigma}{\sigma_B}, \tag{60}$$


which is called the Lexian ratio. It is evident that a series is
normal, hypernormal or subnormal according as $L=1$, $L>1$
or $L<1$ respectively.
As the expression

$$\frac{\sum (p_i-p)^2}{N}$$

marks the essential difference between the dispersions of a
normal and a hypernormal series, it seems only natural to
employ it as a measure of the variations in the chances from the
mean $p$. Since, however, it is dependent on the absolute
values of these chances, we divide it by the square of the mean
of these chances. Denoting the quotient by $\rho^2$ we have

$$\rho^2=\frac{\sum (p_i-p)^2}{Np^2}.$$

We next replace the numerator by its value taken from the
formula for the dispersion of a Lexian series where, however,
we replace $\sigma_L$ by $\sigma$, since the latter is to be computed directly
from the given observations. We have then

$$\rho^2=\frac{\sigma^2-\sigma_B^2}{(n^2-n)p^2}.$$

Neglecting $n$ in comparison with $n^2$, and remembering that
$M_B=np$, we have as an approximation

$$\rho=\frac{\sqrt{\sigma^2-\sigma_B^2}}{M_B}. \tag{61}$$

$100\rho$ is called the Charlier coefficient of disturbancy of a series.
It is evident that the Charlier coefficient is zero for a normal
series, imaginary for a subnormal series and real for a hypernormal
series.

It is easily verified that the Lexian ratio for the series of
divorces considered previously is

$$L=\frac{3.44}{7.27}=0.473$$

and that the Charlier coefficient is

$$100\rho=100\,\frac{\sqrt{11.84-52.89}}{57.4},$$

or imaginary.
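
Both measures are readily computed once $\sigma$, $\sigma_B$ and the mean are known. The sketch below is ours and is offered only as an illustration; the function names are hypothetical, and the complex square root from Python's standard `cmath` module is used so that a subnormal series yields a purely imaginary coefficient instead of an error.

```python
import cmath


def lexian_ratio(sigma, sigma_b):
    """L = sigma / sigma_B."""
    return sigma / sigma_b


def charlier_coefficient(sigma, sigma_b, mean):
    """100*rho, with rho = sqrt(sigma^2 - sigma_B^2) / M (complex if subnormal)."""
    return 100 * cmath.sqrt(sigma**2 - sigma_b**2) / mean


# Series of divorces considered previously
L = lexian_ratio(3.44, 7.27)
cc = charlier_coefficient(3.44, 7.27, 57.4)

print(round(L, 3), cc)
```

A coefficient with zero real part and non-zero imaginary part corresponds to the "imaginary" case of the text.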
EXERCISES 


1. Compute the Charlier coefficient for the following distributions, 
giving the number of deaths from accidents in coal mines in various 
countries: 


        England   Germany   United States   Belgium     Austria     France   Japan
1901     1224      1170         1982          164          81         218     263
1902     1116       995         2263          150     [illegible]     196     188
1903     1134       960         1952          160          50         184     278
1904     1116       900         2135          130          62         193     239
1905     1215       930         2214          127          99         187     354
1906     1161       985         2944          133          70        1262     578
1907     1179      1240         2977          144          73         198     399
1908     1188      1355         2220          150          58         171     262
1909     1287      1021         2440          133          73         210     667
1910     1530       985         2391          133          63         194     245


The total number of miners in each of the countries during the
designated period was approximately: England, 900,000; Germany,
500,000; United States, 610,000; Belgium, 140,000; Austria, 68,000;
France, 180,000; Japan, 110,000.

Data of Exercises 1-2 from Fisher's “Mathematical Theory of Probabilities.”

Answers: * 

England, 9.15; Germany, 13.05; United States, 14.20; 
Belgium, 2.51; Austria, 13.84; France, 106.51; Japan, 42.92. 

2. The Courrieres mine disaster, in 1906, accounted for 1099 of the 
deaths tabulated above, for France. Eliminate the results of this 
disaster and show that the remaining statistical series is subnormal. 


73. Series with Certain Secular Fluctuations.—So far we 
have given our attention solely to the detection of disturbing 
influences in a statistical series. There may, however, be 
certain well-known influences in a given series, whose effects 
are already fairly well appreciated, which we should like to 
eliminate sufficiently to permit us to investigate the presence 
of other but disturbing and possibly vitiating influences. 
Thus, the investigation of a series giving the number of deaths 
by months in a community, due to a disease which occurs 
more frequently at certain times of the year than at others, 
would, of course, verify the presence of violently disturbing 
influences. The fluctuations in such a case would be periodic 
and might well conceal for the time the presence of other and 
undesirable influences. To take another illustration, it is 
fairly well established that the death rate due to cancer is on 
the increase. Hence, a series giving the number of such 
deaths in a community by successive years would, of course, 
prove to be hypernormal if the increase in death rate proved 
to be significant, and again the fluctuations due to this increase 
might well conceal the presence of other and undesirable 
influences. It will be necessary to omit consideration of
methods of eliminating periodic fluctuations, since such consideration
would necessitate the inclusion of methods of determining
the period of such fluctuations. We shall give our
attention, then, to the second type of fluctuations and shall
refer to them as secular fluctuations. In fact, we shall limit
our discussion to secular fluctuations due to influences which
tend to work uniformly in the same direction; that is, to give
either a constant increase or a constant decrease in the series
of frequencies to be considered. We propose first, then, to
derive a method of measuring the rate of such a secular fluctuation,
and then to show how these fluctuations may be eliminated
sufficiently to allow an investigation of the presence
of other and disturbing influences. It is evident that a series
showing, say, an apparent increase in the death rate due to a
certain disease may, after the secular fluctuations due to this
increase in death rate are eliminated, prove so unreliable as
to practically vitiate any satisfactory conclusions that might
otherwise be drawn from the series taken as a whole.

* Taken from Mr. Arne Fisher’s Mathematical Theory of Probabilities. Mr. Fisher
has called our attention to the fact that his original answers, as reprinted in the First
Edition of the present book, were erroneous and should be replaced by the answers
given above.—John Wiley & Sons, Inc.

We shall assume that the fundamental probability varies
by a constant difference * from one set of observations to
another, so that

$$p_i=p_{i-1}+k,$$

and

$$p_i=p_1+(i-1)k.$$

It is easily verified that the average $p$ of $N$ such probabilities
is

$$p=p_1+\frac{N-1}{2}\,k,$$

and that

$$p_i=p+\left(i-\frac{N+1}{2}\right)k.$$

Now, if we assume the observed terms $m_1, m_2, \ldots m_N$ of a
given statistical series to be essentially the same as the values
which would be given by the corresponding probabilities
$p_1, p_2, \ldots p_N$, or the same as $np_1, np_2, \ldots np_N$, where we assume
the number of observations necessary to determine any observed
term to be always the same or $n$, the relation just obtained may
be written

$$m_i=M+\left(i-\frac{N+1}{2}\right)nk, \tag{A}$$

where $M$ denotes the mean or average of the observed terms.
The average variation or fluctuation from one value of $m_i$
to the next is evidently

$$nk=\frac{1}{N-1}\left\{(m_2-m_1)+(m_3-m_2)+\cdots+(m_N-m_{N-1})\right\}=\frac{m_N-m_1}{N-1}. \tag{B}$$

Thus, given a column of values of $i$ $(=1, 2, \ldots N)$ and a
column of corresponding values of $m_i$, we need only to subtract
the first value of $m_i$ from the last value and divide the difference
by the total number of values diminished by unity to obtain $nk$.

To remove the secular fluctuations we need, then, to set
up a new column of values of $m_i$ obtained by solving relation
(A) for $M$, or

$$M=m_i-\left(i-\frac{N+1}{2}\right)nk, \tag{62}$$

and letting $i=1, 2, \ldots N$ successively. We shall refer to this
final series as the residual series.

If the number of observations is not the same for all sets, 
the series will have to be adjusted in accordance with the 
schemes suggested previously. As an illustration, let us con- 
sider the following distribution, which gives the number (m) 
of deaths from cancer in the City of New York adjusted to a 
basic population of n=1,000,000. The last column gives the 
residual series. 


Year      i       m       M
1904      1      609     664
1905      2      639     681
1906      3      619     649
1907      4      658     676
1908      5      631     637
1909      6      683     677
1910      7      710     698
1911      8      710     686
1912      9      721     685
1913     10      718     670

It is easily verified that the adjusted distribution or series
is hypernormal.

By formula (B):

$$nk=\frac{718-609}{9}=12.1.$$

If we substitute this value in formula (62), and let $i=1, 2, \ldots 10$
successively, we obtain the residual series given to the
right above.
It is now easily verified that the new series is subnormal, 
or that the Lexian ratio is 


$L=\tfrac{2}{3}$ (approximately),


and that the Charlier coefficient is imaginary. 
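
The removal of a uniform secular fluctuation by formulas (B) and (62) can be sketched as follows. The code is an illustration under the assumptions of the text (a constant difference $k$ and a common base $n$); the variable names are ours. Since $nk$ is carried here without rounding, individual terms need not agree to the unit with the column printed above.

```python
# Deaths from cancer per 1,000,000, New York City, 1904-1913 (table above)
m = [609, 639, 619, 658, 631, 683, 710, 710, 721, 718]
N = len(m)

# Formula (B): average change from one term to the next
nk = (m[-1] - m[0]) / (N - 1)

# Formula (62): residual term for i = 1, 2, ..., N
residual = [m_i - (i - (N + 1) / 2) * nk for i, m_i in enumerate(m, start=1)]

print(round(nk, 1), [round(r) for r in residual])
```

The residual series keeps the mean of the original series, because the corrections $(i-\frac{N+1}{2})nk$ sum to zero; only the uniform drift is taken out.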


EXERCISES 


Remove the secular fluctuations in the following statistical series
and then test the residual series for normality: 
1. The following data give the number of still-births in Denmark
corresponding to an assumed stationary total number of births of
70,000:


Year          Year          Year          Year          Year

1888 1861     1893 1788     1898 1797     1903 1685     1908 1694
1889 1924     1894 1719     1899 1737     1904 1682     1909 1665
1890 1830     1895 1753     1900 1696     1905 1705     1910 1658
1891 1779     1896 1714     1901 1732     1906 1602     1911 1658
1892 1811     1897 1811     1902 1694     1907 1723     1912 1638


Ans. $nk=-8.92$; $\sigma=37.09$; $\sigma_B=41.6$; and the Charlier coefficient
is imaginary.

2. The following table gives the number of deaths from accidents 
in coal mines, in the United States, in which less than five men were 
killed. Assume the total number of miners to be 630,000: 


1900 1843 1905 1964 1910 2085 
1901 1863 1906 2075 1911 1984 
1902 1837 1907 2190 1912 1839 
1903 1768 1908 1967 1913 1957 
1904 1911 1909 2053 1914 1810 


Ans. C. C.=5.51 




3. The following data give the number of deaths from cancer in 
New York City, as reduced to a stationary population of 1,000,000: 
1889 377 1894 423 1899 513 1904 609 1909 683
1890 476 1895 442 1900 547 1905 639 1910 710
1891 410 1896 493 1901 595 1906 619 1911 710
1892 444 1897 505 1902 540 1907 658 1912 721
1893 462 1898 515 1903 580 1908 631 1913 718


Ans. Residual series is slightly subnormal. 


4. The following data give the number of deaths among members
of the Brotherhood of Locomotive Firemen and Engineers: 


Members Deaths 
1904 54,434 453 
1905 55,287 496 
1906 58,849 461 
1907 62,916 581 
1908 66,408 436 
1909 65,315 411 
1910 73,469 519 
1911 79,942 22 
1912 85,292 558 


Data of Exercises 1-3 from Fisher’s ‘‘ Mathematical Theory of Probabilities.’ 


CHAPTER X 
CORRELATION THEORY 


74. Introduction—Suppose that the average person were 
asked whether sons of fathers who lived to advanced ages 
also tended, in the long run, to live to advanced ages, etc.
The ordinary procedure would be to try to recall actual 
examples. If the experience consisted merely of a few pairs 
of fathers and sons both of whom, in each case, lived to 
advanced ages, the answer would probably be in the affirma- 
tive. If the experience consisted of a few cases showing the 
opposite tendency, the answer would probably be in the 
negative. In neither case, however, would the answer be 
conclusive, because the experience would be entirely too 
limited. Correlation theory is highly useful in such a problem, 
because it enables one to assimilate and weigh any amount of 
experience, however large, in a single application, and hence 
to give what is generally accepted as a fairly conclusive answer. 
A measure of this correlation will often serve to establish 
either a connection between two phenomena which had been 
suggested only by statistical data and whose nature might 
still be unknown, or the independence of two phenomena 
which had been regarded hitherto as related in some way. 
Correlation theory has proved useful in almost every conceivable
field, including biology, psychology, education, etc.,
but particularly in problems of heredity, and has helped to
indicate what characters are inherited and what characters
are due to peculiar environment.

Correlation theory is very useful in suggesting causal
relationship between characters. Thus, if a high death rate¹
due to one disease in a community is almost invariably accompanied
by a high death rate due to a second disease, a causal
relationship is suggested, furnishing a problem for medical
authorities whose solution might lead to the discovery of
causes hitherto unsuspected. It should be emphasized, however,
that the responsibility for the final explanation in such
an investigation rests not with the statistician but with the
authority versed in the particular field.

¹ The student should be cautioned from the outset about using rates
(generally known as “indices”) themselves in correlation problems, lest
his results prove to be spurious. See Art. 81.

Two characters are said to be correlated when with a
selected value of one, certain values of the other are likely to be
associated. Stated more precisely, two characters are said
to be correlated if to a selected series of values of the one there
correspond values of the other whose mean values are functions
of the selected values. The full meaning of this statement will
be appreciated and understood when we have entered
more fully into the theory.

75. The Correlation Surface.—Let the probability of the
occurrence of a deviation or error $x$ be

$$\frac{1}{\sigma_x\sqrt{2\pi}}\,e^{-\frac{x^2}{2\sigma_x^2}},$$

and the probability of the occurrence of a deviation $y$ be the
same expression with $y$ substituted for $x$, where $x$ and $y$ are
used also as subscripts of the corresponding standard deviations
for purposes of distinction. If, and only if, the two probabilities
are independent, the probability of their joint occurrence
is

$$z_1=\frac{1}{\sigma_x\sigma_y\cdot 2\pi}\,e^{-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2}+\frac{y^2}{\sigma_y^2}\right)}, \tag{63}$$

which may be regarded as the equation of a surface. It is
easily verified that all sections of this surface parallel to the
$xz$-plane would be normal curves having the same standard
deviation ($\sigma_x$) and all sections parallel to the $yz$-plane would
be normal and would have the same standard deviation ($\sigma_y$).

It should be evident that the equation of the surface 
representing the probability of the joint occurrence of the 
deviations x and y, when the latter are not necessarily inde- 
pendent, would have to be of a more general form than that 
given above and one which would have to reduce to that 
given above as a special case. It has been shown elsewhere²
that certain very reasonable assumptions lead to the form 


$$z=ke^{-\frac{1}{2}(ax^2+2hxy+by^2)}, \tag{64}$$


wherein the essential difference from the preceding form is the 
presence of the $xy$-term in the exponent. The complete
presentation of the derivation of equation (64) would carry 
us too far from our present purpose. However, if we seek 
merely a more general form of (63) it seems only natural to 
assume form (64). Why is it unnecessary to include $x$-, $y$-
and constant terms in the exponent of (64)? We shall now 
proceed to express the coefficients a, b, h and k in terms of 
characters which can be readily computed. 

² See Elderton’s “Frequency Curves and Correlation,” p. 109.

Let us recall for a moment certain features connected with
the determinations of areas under curves and volumes under
surfaces by the calculus. In determining the area under a
curve by integration, the integrand represents the typical
ordinate or strip of area of infinitesimal width whose area is
summed. Likewise, if we seek to determine the volume under
a surface by double integration, the integrand of the first
integration represents as before the typical ordinate and the
integrand of the second integration represents the typical
section of infinitesimal thickness of the surface. Hence, if
we integrate (64) with respect to $x$ from $-\infty$ to $+\infty$ we obtain
the typical $y$ section or distribution. Performing this integration,
we obtain
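$$\int_{-\infty}^{\infty} k\,e^{-\frac{1}{2}(ax^2+2hxy+by^2)}\,dx = k\,e^{-\frac{by^2}{2}+\frac{h^2y^2}{2a}} \int_{-\infty}^{\infty} e^{-\frac{a}{2}\left(x+\frac{hy}{a}\right)^2}\,dx$$

$$\text{(see equation (34))} \quad = k\sqrt{\frac{2\pi}{a}}\;e^{-\frac{1}{2}\,\frac{ab-h^2}{a}\,y^2},$$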


and for this typical section or distribution

$$\sigma_y^2 = \frac{a}{ab-h^2}.$$

Similarly, integrating the same expression with respect to
y, we obtain

$$\sigma_x^2 = \frac{b}{ab-h^2}.$$


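If we let

$$\frac{h}{\sqrt{ab}} = -r, \tag{A}$$

we obtain

$$\frac{1}{\sigma_x^2} = a(1-r^2),$$

or

$$a = \frac{1}{\sigma_x^2(1-r^2)}. \tag{B}$$

Similarly

$$b = \frac{1}{\sigma_y^2(1-r^2)}. \tag{C}$$

Hence

$$h = \frac{-r}{\sigma_x\sigma_y(1-r^2)}. \tag{D}$$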


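If we complete the double integration begun above (say,
with respect to x) we obtain the volume or N, or

$$N = k\sqrt{\frac{2\pi}{a}}\int_{-\infty}^{\infty} e^{-\frac{y^2}{2\sigma_y^2}}\,dy = k\sqrt{\frac{2\pi}{a}}\;\sigma_y\sqrt{2\pi} = \frac{2\pi k\sigma_y}{\sqrt{a}} = 2\pi k\,\sigma_x\sigma_y\sqrt{1-r^2},$$

or

$$k = \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}. \tag{E}$$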
Substituting the values of a, h, b and k just found in (64),
the equation becomes

$$z = \frac{N}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}\; e^{-\frac{1}{2(1-r^2)}\left(\frac{x^2}{\sigma_x^2} - \frac{2rxy}{\sigma_x\sigma_y} + \frac{y^2}{\sigma_y^2}\right)}. \tag{65}$$


The expression just found for z with N=1 represents the 
probability of the joint occurrence of the deviations x and y 
whether they be dependent or independent. The expression 
for $z_1$ represents the probability of the joint occurrence of x
and y only when they are independent. The difference between
the two expressions depends solely upon r; in fact, for r=0
the expression for z reduces to that for $z_1$. All this suggests
the advisability of using values of r as measures of the dependence
or correlation of the two characters. The surface corresponding
to equation (65) is called the correlation surface and r
is called the correlation coefficient. In the next section we shall 
derive a method for computing the value of the correlation 
coefficient. 
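
The two properties of equation (65) used above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the test point and the values of the standard deviations and of r are illustrative assumptions, and the volume is estimated with a simple midpoint rule rather than by the calculus.

```python
import math

def correlation_surface(x, y, sx, sy, r, n=1.0):
    """Ordinate z of equation (65) for deviations x, y."""
    q = (x / sx) ** 2 - 2 * r * (x / sx) * (y / sy) + (y / sy) ** 2
    norm = n / (2 * math.pi * sx * sy * math.sqrt(1 - r * r))
    return norm * math.exp(-q / (2 * (1 - r * r)))

def normal_curve(x, s):
    """Ordinate of the normal curve with unit area and standard deviation s."""
    return math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def volume(sx, sy, r, steps=200, span=8.0):
    """Midpoint-rule estimate of the volume under the surface (N = 1)."""
    dx, dy = 2 * span * sx / steps, 2 * span * sy / steps
    total = 0.0
    for i in range(steps):
        x = -span * sx + (i + 0.5) * dx
        for j in range(steps):
            y = -span * sy + (j + 0.5) * dy
            total += correlation_surface(x, y, sx, sy, r)
    return total * dx * dy

# With r = 0 the surface factors into the two independent normal curves.
z0 = correlation_surface(1.0, -0.5, 2.0, 3.0, 0.0)
z1 = normal_curve(1.0, 2.0) * normal_curve(-0.5, 3.0)
print(abs(z0 - z1) < 1e-12, round(volume(2.0, 3.0, 0.6), 4))
```

The first printed value compares the r = 0 ordinate with the product of the two normal ordinates; the second is the estimated volume, which should be close to N = 1.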

76. The Product Moment and the Formula for the Corre- 
lation Coefficient—Just as the ordinate at the mean or the 
centroid vertical of a plane curve is the ordinate about which 
the first moment is zero, the centroid vertical of a surface is 
the vertical line about which the first moments of both the x
and y distributions are zero (i.e., the vertical line passes
through the center of gravity). 

If we define what is generally called the product moment of
a surface z=f(x, y) by the relation

$$\Sigma xy = \iint xy\,f(x, y)\,dx\,dy,$$

valued between appropriate limits, then the product moment
of the correlation surface about the centroid vertical is

$$\Sigma xy = k\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,e^{-\frac{1}{2}(ax^2+2hxy+by^2)}\,dx\,dy$$

$$= k\int_{-\infty}^{\infty} y\,dy\int_{-\infty}^{\infty} x\,e^{-\frac{1}{2}(ax^2+2hxy+by^2)}\,dx.$$
But the integration with respect to x may be written

$$-\frac{1}{a}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(ax^2+2hxy+by^2)}(-ax-hy+hy)\,dx$$

$$= -\frac{1}{a}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(ax^2+2hxy+by^2)}(-ax-hy)\,dx \;-\; \frac{hy}{a}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(ax^2+2hxy+by^2)}\,dx.$$

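It is left for the student to show that the value of the first
integral is zero. Moreover, the value of the second integral
has already been found (verify) to be

$$\sqrt{\frac{2\pi}{a}}\;e^{-\frac{y^2}{2\sigma_y^2}}.$$

Hence, the value of the product moment of the correlation surface reduces
to

$$\Sigma xy = k\int_{-\infty}^{\infty} y\left(-\frac{hy}{a}\right)\sqrt{\frac{2\pi}{a}}\;e^{-\frac{y^2}{2\sigma_y^2}}\,dy.$$

It is easily verified that

$$\int_{-\infty}^{\infty} y^2\,e^{-\frac{y^2}{2\sigma_y^2}}\,dy = \sigma_y^3\sqrt{2\pi}.$$

(Integrate by parts, letting u=y.)
Hence,

$$\Sigma xy = \frac{-hk\sqrt{2\pi}\;\sigma_y^3\sqrt{2\pi}}{a\sqrt{a}} = \frac{-2\pi hk\,\sigma_y^3}{a\sqrt{a}}.$$

If we substitute the values of a, h and k in this expression,
we obtain

$$\Sigma xy = N\sigma_x\sigma_y r,$$

whence

$$r = \frac{\Sigma xy}{N\sigma_x\sigma_y}. \tag{66}$$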
Explain how $\Sigma xy$ may mean the same as $\Sigma xy f(x, y)$; compare
the distinction between the two with the distinction
between the ordinary arithmetic and the weighted arithmetic
averages.
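
The substitution that leads to formula (66) is only stated in the text. One way of writing it out, starting from the product moment in the form $-2\pi hk\,\sigma_y^3/(a\sqrt{a})$ and using (B), (D) and (E), is sketched below; the grouping of the factors is our own and is not necessarily the author's.

```latex
a\sqrt{a} = \frac{1}{\sigma_x^{3}(1-r^2)^{3/2}}, \qquad
-hk = \frac{r}{\sigma_x\sigma_y(1-r^2)}\cdot\frac{N}{2\pi\sigma_x\sigma_y\sqrt{1-r^2}}
    = \frac{Nr}{2\pi\,\sigma_x^{2}\sigma_y^{2}(1-r^2)^{3/2}},

\Sigma xy = \frac{-2\pi hk\,\sigma_y^{3}}{a\sqrt{a}}
          = 2\pi\sigma_y^{3}\cdot\frac{Nr}{2\pi\,\sigma_x^{2}\sigma_y^{2}(1-r^2)^{3/2}}\cdot\sigma_x^{3}(1-r^2)^{3/2}
          = N r\,\sigma_x\sigma_y .
```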

77. The Unit Product Moment. Notation.—At this point 
it is well to recall the distinction made previously between a 
frequency surface and the corresponding histogram, and to 
keep in mind that the various moments of a histogram are 
only approximately equal to the corresponding moments 
of the corresponding curve or surface, the various moments 
of the histogram being found usually by ordinary summa- 
tion or simple addition, and the moments of the curve or 
surface being usually found by means of the calculus. Thus, 
$\Sigma xy f(x, y)$ (or $\Sigma xy$) for a histogram means that each value of
f(x, y) is to be multiplied by the corresponding values of x
and y (class marks) and the sum of all such products found.
For example, in the correlation table given in Art. 79, f(x, y)
for x= -1 and y= -2 has the value 11 and for that frequency
xy f(x, y) =(-1)(-2)11=22. The sum of all such products
is called the product moment of the histogram. However, 
the value of such a product moment would not be the value 
of the product moment about the centroid vertical. Just as 
it was found very inconvenient in frequency distributions of 
two dimensions to compute the values of the moments directly 
from deviations from the mean, and more convenient to employ 
a trial mean and make corrections later, so in the case of the 
product moment it will prove more convenient to employ a 
trial centroid vertical and employ corrections to give the 
value of the product moment about the true centroid vertical. 
The formula for making these corrections will be derived in 
the next section. 

If we designate the unit product moment about the centroid
vertical of a surface by $\mu_{xy}$ (and about any other vertical line
by $\mu'_{xy}$) formula (66) becomes

$$r = \frac{\mu_{xy}}{\sigma_x\sigma_y}. \tag{67}$$
The unit product moment about the centroid vertical
of a histogram is denoted by $\nu_{xy}$ (and about any other vertical
line by $\nu'_{xy}$).

78. The Standard Method of Computing the Product 
Moment about the Centroid Vertical—We shall now establish 
the relation between the product moment of a surface about 
the centroid vertical and the product moment about any other 
vertical line. 

Regarding this case as analogous to that of two dimensions,
let z=f(x, y) be any value of z, x and y the corresponding
coordinates with respect to the centroid vertical, and
x' and y' the corresponding coordinates with respect to any
other vertical line taken as the z-axis. Let the component
distances of the centroid vertical from the arbitrary z-axis
be h and k. Then, we have

$$x' = x + h \quad\text{or}\quad x = x' - h,$$
$$y' = y + k \quad\text{or}\quad y = y' - k.$$

Hence, if we write f(x, y) after the bracket for sake of
brevity (where, however, it is to be included in the summation),

$$\mu_{xy} = \frac{1}{N}\,\Sigma\,[x'y' - kx' - hy' + hk]\,f(x, y)$$

$$= \frac{1}{N}\,[\Sigma x'y' - k\Sigma(x+h) - h\Sigma(y+k) + hk\Sigma]\,f(x, y)$$

$$= \frac{1}{N}\,[\Sigma x'y' - k\Sigma x - h\Sigma y - hk\Sigma]\,f(x, y).$$

But $\Sigma x f(x, y) = \Sigma y f(x, y) = 0$. (Why?)

Also $\Sigma f(x, y) = N$. (Why?) and $hk\Sigma f(x, y) = hkN$.
Therefore,

$$\mu_{xy} = \mu'_{xy} - hk, \tag{68}$$

where $\mu'_{xy}$ is the unit product moment of a surface about any
vertical line and h and k are, for purposes of computation, the
first unit moments of the x and y distributions respectively.
Strictly speaking, the symbol $\mu$ would be appropriate only
when the summation denoted by $\Sigma$ is performed by integration.
However, as quadrature formulas are almost never used in
computing the value of the correlation coefficient, a confusion
of symbols in such a connection is not serious. Formula (67)
can now be written in the more workable form

$$r = \frac{\mu'_{xy} - hk}{\sigma_x\sigma_y}. \tag{69}$$
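
Formulas (68) and (69) can be tried on a small table. The sketch below is illustrative only: the frequencies are hypothetical, the function names are our own, and class marks are assumed to be measured in unit steps from a trial origin. It computes r once from the trial-origin moments with the correction hk and once directly from deviations about the true means.

```python
import math

# Hypothetical frequencies f(x', y'), keyed by class marks measured from a trial origin.
table = {(-1, -1): 3, (-1, 0): 2, (0, -1): 2, (0, 0): 5,
         (0, 1): 2, (1, 0): 3, (1, 1): 4, (2, 1): 1}

def unit_moments(freq):
    """Return N and the unit moments about the trial origin."""
    n = sum(freq.values())
    h = sum(f * x for (x, y), f in freq.items()) / n        # first moment of x
    k = sum(f * y for (x, y), f in freq.items()) / n        # first moment of y
    v2x = sum(f * x * x for (x, y), f in freq.items()) / n  # second moment of x
    v2y = sum(f * y * y for (x, y), f in freq.items()) / n  # second moment of y
    vxy = sum(f * x * y for (x, y), f in freq.items()) / n  # product moment
    return n, h, k, v2x, v2y, vxy

def r_trial_origin(freq):
    """Formula (69): correct the trial-origin moments, then divide."""
    n, h, k, v2x, v2y, vxy = unit_moments(freq)
    sx = math.sqrt(v2x - h * h)
    sy = math.sqrt(v2y - k * k)
    return (vxy - h * k) / (sx * sy)

def r_direct(freq):
    """Same coefficient from deviations measured from the true means."""
    n, h, k, *_ = unit_moments(freq)
    sxy = sum(f * (x - h) * (y - k) for (x, y), f in freq.items())
    sxx = sum(f * (x - h) ** 2 for (x, y), f in freq.items())
    syy = sum(f * (y - k) ** 2 for (x, y), f in freq.items())
    return sxy / math.sqrt(sxx * syy)

print(round(r_trial_origin(table), 4), round(r_direct(table), 4))
```

The two printed values should agree, which is the content of (68): shifting the origin changes the raw product moment by exactly hk.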


79. The Computation of the Correlation Coefficient—We 
shall now show by illustration the computation of the corre- 
lation coefficient where the only feature which is essentially new 
is the computation of the product moment. The numbers 
given in the table itself record the number of students in the 
freshman class of a certain college having the corresponding 
weights and abdominal measurements designated. Thus, the 
“13” in the third row from the bottom refers to 13 students 
who weighed approximately 110 pounds and had abdominal 
measurements of approximately 24 inches. It is desired to 
compute the value of the correlation coefficient for these two 
characteristics. 

First, the rows are summed horizontally to give the 
y-distribution and the columns vertically to give the 2-distri- 
bution. Then the various moments of these distributions are 
computed exactly in accordance with methods previously 
considered. The product moment is computed in each of two 
ways for purposes of check—by rows and by columns. For 
example, the number 268 (in the last column) = -2{13(-3)+42(-2)+11(-1)+1(0)}.

The entire computation is given below.³

$$h = \frac{-412}{455} = -0.905, \qquad \sigma_x = \sqrt{\frac{966}{455} - (-0.905)^2} = 1.142,$$

$$k = \frac{-169}{455} = -0.363, \qquad \sigma_y = \sqrt{\frac{793}{455} - (-0.363)^2} = 1.201.$$

Wts.            Abdominal Measurements (inches)
(lbs.)   24  26  28  30  32  34  36  38  40     f    y    yf   y²f   xy
230 2 A 12 72 48 
215 1 iL 5 25 25 
200 1 i 2 #4 8 32 24 
185 ee Lae eee EL 5S 8 15 45 30 
170 i iy & 2 23° 2 46 92 32 
155 iil 248) ils} ail 541 54 54 4 
140 1058538 2 135 0 
125 i GA iss aul 162 —1 —162 162 215 
iO) ese ala ak 67 —2 —134 268 268 
95 1 2 Sa) = 8) Al 
80 1 1-4-—- 4 16 8 
f         15  117  196   92   22    5    3    3    2    N = 455    -169   793   675
x         -3   -2   -1    0    1    2    3    4    5
xf       -45 -234 -196    0   22   10    9   12   10    -412
x²f      135  468  196    0   22   20   27   48   50     966
xy        90  312   97    0   30   20   21   60   45     675

3 Crelle’s (multiplication) tables and Barlow’s tables of squares,
square roots, reciprocals, etc., will be found very useful in computing the
value of a correlation coefficient.
$$\nu'_{xy} = \frac{675}{455} = 1.484,$$

$$\nu_{xy} = 1.484 - (-0.905)(-0.363) = 1.153,$$

$$r = \frac{1.153}{(1.142)(1.201)} = 0.843.$$


The formula for the probable error of the correlation coefficient
is

$$\pm\frac{0.6745(1-r^2)}{\sqrt{N}},$$

which, in this case, has the value ±0.022.
The means of the two distributions are

$$M_x = 30 + 2(-0.905) = 28.190,$$
$$M_y = 140 + 15(-0.363) = 134.76.$$


The true values of the standard deviation should also take 
consideration of the unit of measurement. The true values 
would then be 


$\sigma_x = 2(1.142) = 2.284$ and $\sigma_y = 15(1.201) = 18.015$.


80. The Units of Measurement.—It will surely be noticed 
that the original units of measurement were completely ignored 
in the computation of the value of the correlation coefficient 
in the preceding section; the units of measurement, which were 
2 and 15, were changed at once to unity, and the origin of each 
distribution was translated to the class mark nearest the mean, 
as determined by inspection. It is permissible, in computing 
the value of the correlation coefficient, to ignore the original 
units of measurement, because if they were retained they 
would cancel in the final result; for, suppose that the unit of 
measurement in the x direction were m and in the y direction 
n; then h, obtained by assuming the unit of measurement 
to be unity, would actually be mh, and $\nu'_2$ would be $m^2\nu'_2$;
therefore $\nu_2$ would be $m^2\nu'_2 - m^2h^2$ and $\sigma_x$ would be $m\sigma_x$.
Likewise $\sigma_y$ would be $n\sigma_y$. The product moment $\nu'_{xy}$ would be
$mn\nu'_{xy}$ and $\nu_{xy} = \nu'_{xy} - hk$ would be $mn\nu'_{xy} - mnhk = mn\nu_{xy}$.
Therefore

$$r = \frac{mn\,\nu_{xy}}{mn\,\sigma_x\sigma_y} = \frac{\nu_{xy}}{\sigma_x\sigma_y},$$



which shows that the value of the correlation coefficient is 
independent of the units of measurement m and n. It should
be emphasized, however, that this statement holds merely 
for the correlation coefficient; in computing the standard 
deviation or in locating the mean of a distribution, the unit 
of measurement is essential. Thus, the mean of the abdominal 
measurements is 30—2(0.905) =28.190 and the standard devia- 
tion is 2(1.142) = 2.284. 
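
The independence of r from the units of measurement, and the dependence of the standard deviations on them, can be illustrated as follows. This is a sketch under stated assumptions: the frequencies are hypothetical, the helper name is our own, and the class intervals 2 and 15 and the origins 30 and 140 are borrowed from Art. 79 only to make the rescaling concrete.

```python
import math

def r_and_sigmas(pairs):
    """Correlation coefficient and standard deviations of (x, y, f) triples."""
    n = sum(f for _, _, f in pairs)
    mx = sum(f * x for x, _, f in pairs) / n
    my = sum(f * y for _, y, f in pairs) / n
    sx = math.sqrt(sum(f * (x - mx) ** 2 for x, _, f in pairs) / n)
    sy = math.sqrt(sum(f * (y - my) ** 2 for _, y, f in pairs) / n)
    vxy = sum(f * (x - mx) * (y - my) for x, y, f in pairs) / n
    return vxy / (sx * sy), sx, sy

# Hypothetical class marks in unit steps from a trial origin, with frequencies.
unit = [(-1, -1, 3), (-1, 0, 2), (0, 0, 5), (0, 1, 2), (1, 0, 3), (1, 1, 4)]
m, n_unit = 2, 15                       # class intervals in the x and y directions
scaled = [(30 + m * x, 140 + n_unit * y, f) for x, y, f in unit]

r1, sx1, sy1 = r_and_sigmas(unit)
r2, sx2, sy2 = r_and_sigmas(scaled)
print(round(r1, 6) == round(r2, 6), sx2 / sx1, sy2 / sy1)
```

The coefficient is unchanged by the change of unit and origin, while the two standard deviations are multiplied by m and n respectively.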

It will be shown later that the correlation coefficient may 
have any value between —1 and 1. Zero signifies no correla- 
tion, and 1 perfect positive correlation; that is, when r=1, 
deviations of either characteristic are definitely associated 
with like deviations—in size and sign—of the other. Perfect 
negative correlation signifies a definite association of devia- 
tions but of opposite sense. It should perhaps be added 
that the best way for the student to become familiar with the 
kinds of data which are appropriate for a problem in correla- 
tion and with the method of arranging such data in a table 
would be to undertake an original problem. 


EXERCISES 
Find the value of the correlation coefficient and its probable error 
for the following tables: 
1. Heights in inches and weights in pounds of Glasgow school-boys, 
ages 4.5 to 5.5 years. (From Biometrika, Vol. XI.) 


Height 26 31 36 41 46 ~« 51 
31 2 

34 Bearts 5 

37 Ja 1S ene 8 
2. Use the correlation table of grades in mathematics and by
psychological test given in Art. 40.


Ans. $M_x = 149.98$, $\sigma_x = 19.8$,
     $M_y = 74.28$, $\sigma_y = 10.8$, r = 0.37 ± 0.03


3. Heights and weights of Glasgow school-boys, ages 13.5 to 14.5 
years. 


Height Weight 
46 51 56 61 66 71 76 81 86 91 96 101 106 111 116 121 
43 1 


46 Mow ge Al 

AOR eat ay Open tweeted 

52 Pigs toe AN wane soto Rte ol 

55 eck PALI 49 2 ae ae ae 

58 1 2 19 25e28 165 7 3 

61 1. i ae Do <a 

64 1 2 1 1 1 


4. Left cubits (mm.) and left middle finger (mm.) Cairo-born 
Egyptians. (Biom. Vol. XII.) 


mm. Cubits
395 410 425 440 455 470 485 500 515 530 545
94.5 1 
98.5 1 1 1 1 
102.5 it 9 17 4 1 
106.5 6 33 £69 13 
110.5 4 J&- "62" 975 11 6 
114.5 1 29 100 68 17 
118.5 1 aot aye Pal 5 
122.6 Aiea eo 11 2 
126.5 5 19 8 4 
130.5 1 2 2 
134.5 2 1


5. Relationship between size of annual income and per cent of 
total annual expenditure for food—from an investigation into the 
standard of living in the District of Columbia. 
Per Cent

22 eer oO 4 OSA? 46 505458 
$600 1 2 3 4 4 1 1 
800 1 if 5 5 9 12 10 1 
1000 3 9 ili 6 4 5 
1200 1 2 5 SS 18 2, 2, 1 
1400} 2 4 8 6 8 6 1 
1600} 2 3 4 8 3 2 3 
1800 1 2 1 1 1 
2000 3 
2200 1 


81. Regression. Linear Regression.—We shall refer to the 
columns (or rows) of frequencies of a correlation table cor- 
responding to a particular value of x as a y-array and to the 
rows (or columns) of frequencies corresponding to a particular 
value of y as x-arrays. Then, if we plot the means of the 
various y-arrays of an extensive table, these means will lie 
approximately on a smooth curve. The means can then be 
fitted by a curve by any convenient method such as that of 
least squares. The curve thus obtained is called the curve of 
regression (from the original application which showed the 
tendency of individual traits to “regress” or conform back
to those of the general population), and if this curve is a 
straight line the regression is said to be linear. As the regres- 
sion is usually linear we shall consider this case only. 

The immediate problem is to determine the values of m
and b for which the line y=mx+b will fit the means of the
y-arrays. The problem can be solved for each particular
correlation table, but it is possible and preferable to express
m and b once for all in terms of familiar moments which can
be readily computed or read off from computations already
made in any particular problem. If we employ the method
of least squares we are to determine the values of m and b
for which

$$\Sigma(\bar{y} - mx - b)^2$$

is a minimum, where $\bar{y}$ is the mean of the y's of any y-array
corresponding to the class mark x, and x and y are deviations
from the means of the corresponding distributions. Differentiating
with respect to b and m, we have respectively

$$-2\Sigma(\bar{y} - mx - b) = 0, \quad\text{or}\quad \Sigma\bar{y} = m\Sigma x + \Sigma b, \tag{F}$$

$$-2\Sigma x(\bar{y} - mx - b) = 0, \quad\text{or}\quad \Sigma x\bar{y} = m\Sigma x^2 + b\Sigma x. \tag{G}$$

If we take the origin at the centroid

$$\Sigma\bar{y} = \Sigma x = 0. \quad\text{(Why?)}$$

Therefore, by (F)

$$\Sigma b = Nb = 0, \quad\text{or}\quad b = 0,$$

and by (G)

$$\Sigma x\bar{y} = m\Sigma x^2,$$

or

$$\frac{\Sigma x\bar{y}}{N} = m\,\frac{\Sigma x^2}{N}. \tag{H}$$

Now, for any y-array, x is constant and $\Sigma x\bar{y}$ can be written
$x\Sigma\bar{y}$, which has the same value as $x\Sigma y$ (Why?) or $\Sigma xy$, and since
the values of $\Sigma x\bar{y}$ and $\Sigma xy$ are the same for each y-array they
are the same for the entire correlation table. Hence, $\Sigma x\bar{y}$ or
$\Sigma xy$ is the product moment and (H) may be written

$$\nu_{xy} = m\sigma_x^2,$$

whence

$$m = \frac{\nu_{xy}}{\sigma_x^2} = r\,\frac{\sigma_y}{\sigma_x},$$

and the equation of the line of regression may be written

$$y = r\,\frac{\sigma_y}{\sigma_x}\,x. \tag{71}$$

Similarly, the line of regression which fits the means of the
x-arrays is

$$x = r\,\frac{\sigma_x}{\sigma_y}\,y,$$

which, of course, is not in general equivalent to the equation
(71).

The expressions $r\frac{\sigma_y}{\sigma_x}$ and $r\frac{\sigma_x}{\sigma_y}$ are called regression coefficients.
It is very important to note that although the value of the 
correlation coefficient is independent of the original units of 
measurement, the values of the standard deviation and, hence, 
of the regression coefficients are not. The value of the regres- 
sion coefficient computed, assuming the units of measurement 
to be unity instead of, say, m and n, is easily adjusted by 
multiplying by n/m (or m/n). 
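
The statement that the least-squares slope equals r times the ratio of the standard deviations can be checked on any small table. The following sketch uses hypothetical (x, y, f) triples and variable names of our own choosing; the slopes are obtained from the normal equation (G) with the origin at the centroid, so that b = 0.

```python
import math

# Hypothetical (x, y, f) triples; deviations are formed below.
data = [(1, 2, 3), (2, 2, 4), (2, 3, 5), (3, 3, 6), (3, 5, 2), (4, 4, 4), (5, 5, 1)]

n = sum(f for _, _, f in data)
mx = sum(f * x for x, _, f in data) / n
my = sum(f * y for _, y, f in data) / n
sxx = sum(f * (x - mx) ** 2 for x, _, f in data)
syy = sum(f * (y - my) ** 2 for _, y, f in data)
sxy = sum(f * (x - mx) * (y - my) for x, y, f in data)

sigma_x, sigma_y = math.sqrt(sxx / n), math.sqrt(syy / n)
r = sxy / (n * sigma_x * sigma_y)

# Least-squares slopes obtained directly from the normal equation (G) with b = 0.
slope_y_on_x = sxy / sxx
slope_x_on_y = sxy / syy
print(slope_y_on_x, r * sigma_y / sigma_x)
print(slope_x_on_y, r * sigma_x / sigma_y)
```

Each printed pair should agree, and the product of the two slopes is r squared, which is one way of seeing that the two regression lines coincide only when r is +1 or -1.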

82. Interpretation of the Regression and Correlation 
Coefficients.—The regression coefficient is highly useful not 
only for purposes of statistical measurement but also as pro- 
viding the best interpretation of the correlation coefficient. 
If we regard, for the moment, the arithmetic average of a set 
of variates as the most probable value of the variates, a selected 
value of x need only be multiplied by the value of the regres- 
sion coefficient to determine the most probable value of y to be 
associated with it. As the value of the regression coefficient
$r\frac{\sigma_y}{\sigma_x}$ in the problem considered in a preceding section has the
value $\frac{18.015}{2.284}(0.843) = 6.649$, the most probable value of y
to be associated with a selected value of x is 6.649x. Thus, an
abdominal measurement of 34 inches corresponds to a deviation 
from the mean of such measurements of 34.000—28.190 or 
5.810. Hence, the most probable deviation from the mean of 
the weights to be associated with the deviation 5.810 is 
6.649(5.810) or 38.63 which corresponds to a weight of 
134.76+38.63=173.39 pounds.
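
The arithmetic of this example can be retraced in a few lines. The sketch below simply takes the values quoted in the text (r = 0.843, the two standard deviations and the two means) as given defaults; the function name is our own.

```python
def predict_weight(abdomen_in, r=0.843, sigma_x=2.284, sigma_y=18.015,
                   mean_x=28.190, mean_y=134.76):
    """Most probable weight (lb) for a given abdominal measurement (in)."""
    coefficient = r * sigma_y / sigma_x      # regression coefficient of y on x
    deviation = abdomen_in - mean_x          # deviation x from the mean
    return mean_y + coefficient * deviation

print(round(0.843 * 18.015 / 2.284, 3), round(predict_weight(34.0), 2))
```

The same function, with the roles of x and y interchanged and the other regression coefficient, would serve for predicting an abdominal measurement from a weight.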
If the equation of the line of regression be written

$$\frac{y}{\sigma_y} = r\,\frac{x}{\sigma_x},$$

it is evident that if we think of the deviations x and y as
expressed in standard units by dividing by the value of the
standard deviation, the regression coefficient becomes identi- 
cally the correlation coefficient and the interpretation of the 
regression coefficient given above applies to the correlation 
coefficient. 


EXERCISES 


1. Find the psychological grade which corresponds to a mathemati- 
cal grade of 70. Ans. 147. 

2. Find the mathematical grade which corresponds to a psycho- 
logical grade of 135. 

3. Find the abdominal measurement which corresponds to a weight 
of 180 pounds. 

4. Find the weight which corresponds to an abdominal measurement
of 40 inches.


83. Upper and Lower Bounds of the Correlation Coefficient. 
—The sum of the squares of the residuals found in a correlation 
table by taking the difference between the observed deviation 
y and the corresponding theoretical value as given by the 
equation of the line of regression (71) is 


$$\Sigma\left(y - r\frac{\sigma_y}{\sigma_x}x\right)^2 = \Sigma y^2 - 2r\frac{\sigma_y}{\sigma_x}\Sigma xy + r^2\frac{\sigma_y^2}{\sigma_x^2}\Sigma x^2$$

$$= N\sigma_y^2 - 2r\frac{\sigma_y}{\sigma_x}\,rN\sigma_x\sigma_y + r^2\frac{\sigma_y^2}{\sigma_x^2}\,N\sigma_x^2$$

$$= N\sigma_y^2(1-r^2),$$

and since the left side cannot be negative, neither can the right side,
which requires that $r^2 \le 1$ or that r be not greater than +1
or less than -1. If $r^2=1$ all the points corresponding to the
frequencies lie on the regression lines and the lines coincide.
(Why?) In other words, if the value of the deviation of one
character is given when $r=\pm 1$, the value of the corresponding
deviation of the other character is uniquely determined.
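
The identity for the sum of the squared residuals can be verified numerically. This is a minimal sketch with hypothetical (x, y, f) triples and our own variable names; it forms the deviations, the regression slope, and compares the two sides of the identity.

```python
import math

# Hypothetical (x, y, f) triples.
data = [(1, 2, 3), (2, 2, 4), (2, 3, 5), (3, 3, 6), (3, 5, 2), (4, 4, 4), (5, 5, 1)]

n = sum(f for _, _, f in data)
mx = sum(f * x for x, _, f in data) / n
my = sum(f * y for _, y, f in data) / n
dev = [(x - mx, y - my, f) for x, y, f in data]          # deviations from the means

sigma_x = math.sqrt(sum(f * x * x for x, _, f in dev) / n)
sigma_y = math.sqrt(sum(f * y * y for _, y, f in dev) / n)
r = sum(f * x * y for x, y, f in dev) / (n * sigma_x * sigma_y)

slope = r * sigma_y / sigma_x
residual_sum = sum(f * (y - slope * x) ** 2 for x, y, f in dev)
print(residual_sum, n * sigma_y ** 2 * (1 - r * r))
```

Since the first quantity is a sum of squares, the agreement of the two printed values shows directly why 1 - r² cannot be negative.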
84.⁴ Correlation between n Variates. We have already
discussed correlation between two characters or variates and
explained the method of computing the correlation coefficient
and of obtaining the equation of the line of regression. It is 
sometimes desirable to measure the correlation between several 
variates. Thus, it may be illuminating to measure the corre- 
lation between the index prices of foods with other factors of 
our social life, or between characteristics not only of a father 
and his son but also of the son and both parents, and even of 
the brothers and sisters and of grandparents, and to determine 
a linear relation corresponding to the line of regression in the 
case of two variates. The regression coefficients would at 
least indicate the relative influence of the several factors. The 
formulas necessary for treating n variates, which correspond to 
the formulas derived previously, will now be given but without 
derivation. The application of the formulas will involve no 
new principles except possibly a few elementary ones involving 
determinants. 

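The normal correlation function of which (65) is a special
case where n=2 has the form

$$z = \frac{N}{\sigma_1\sigma_2\cdots\sigma_n\sqrt{S(2\pi)^n}}\; e^{-\frac{1}{2S}\sum_{i}\sum_{j} S_{ij}\frac{x_i x_j}{\sigma_i\sigma_j}},$$

where S is the determinant

$$S = \begin{vmatrix} 1 & r_{12} & r_{13} & \cdots & r_{1n} \\ r_{21} & 1 & r_{23} & \cdots & r_{2n} \\ r_{31} & r_{32} & 1 & \cdots & r_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ r_{n1} & r_{n2} & r_{n3} & \cdots & 1 \end{vmatrix} \tag{73}$$

and $S_{ij}$ is the co-factor of $r_{ij}$ ($= r_{ji}$) or the determinant obtained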


4 This article may well be omitted in an elementary course. Some
general explanation would, however, be desirable to explain the use of 
formula (75) to avoid spurious results. 
by striking out the row and column passing through $r_{ij}$ and
giving it the proper sign. The various correlation coefficients 
in (73) are computed in the ordinary way but are called partial 
correlation coefficients because in computing their values the 
other variates are regarded for the time as constant. 

In order to consider cases where n is greater than 2, we 
introduce the symbols 


te 

= . . . . . . . - A 

ee ice a 
a 

$$R_{ij} = \frac{-S_{ij}}{\sqrt{S_{ii}S_{jj}}}. \tag{74}$$

$R_{ij}$ is called the multiple correlation coefficient between the
characters (i.e., deviations) $x_i$ and $x_j$ of the nth order where
there are n characters involved. It is easily verified that $R_{12}$
is identically the correlation coefficient r when n=2. Likewise,
when n=3,

$$R_{12} = \frac{r_{12} - r_{13}r_{23}}{\sqrt{(1-r_{13}^2)(1-r_{23}^2)}}. \tag{75}$$
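
That formula (75) is the case n=3 of formula (74) can be confirmed by computing the co-factors directly. The sketch below is illustrative: the three coefficients are hypothetical, the function names are our own, and indices are 0-based in the code. (In present-day usage the quantity computed here is usually called a partial correlation coefficient.)

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def cofactor(m, i, j):
    """Signed minor obtained by striking out row i and column j (0-based)."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(minor)

def r_by_74(m, i, j):
    """Formula (74): R_ij = -S_ij / sqrt(S_ii * S_jj)."""
    return -cofactor(m, i, j) / math.sqrt(cofactor(m, i, i) * cofactor(m, j, j))

def r_by_75(r12, r13, r23):
    """Formula (75), the case n = 3."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

r12, r13, r23 = 0.6, 0.5, 0.4            # hypothetical coefficients
s = [[1, r12, r13], [r12, 1, r23], [r13, r23, 1]]
print(r_by_74(s, 0, 1), r_by_75(r12, r13, r23))
```

Applying `r_by_74` to the two-rowed determinant with off-diagonal element r returns r itself, in agreement with the remark about n=2.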


Referring again to the case of n characters and following 
the same line of procedure to obtain formula (71), the following 
regression equation of the first degree, for determining the 
most probable value of $x_1$ to be associated with selected values
of $x_2, x_3, \ldots, x_n$, is obtained:


= Rie wet+Ris—aa+ ...+Rin ta, . . (76) 
a2 a3 An 

which may be written 

$$x_1 = b_{12}x_2 + b_{13}x_3 + \cdots + b_{1n}x_n, \tag{77}$$
where 

ake 
hia Re 
1 ey Fi Does ee 


It should be recalled that in computing the values of the
partial correlation coefficients $r_{ij}$ the original units of measurement
may be ignored, but that they must be retained in
expressing the values of the standard deviations as they appear
in formula (78).

The labor involved in computing the multiple correlation 
coefficients proves excessive if expansions of formula (74) are 
used. Formulas,⁵ called recursion formulas, have been derived
for expressing the value of a multiple correlation coefficient 
in terms of those of the next lower order but the easiest method, 
when the order is greater than 3, is to valuate the various 
co-factors of determinant (73) by successive expansion in 
accordance with Laplace’s development, after zeros have 
been made to appear in the first column (or row). The method 
is given in any textbook on elementary determinant theory 
and will not be reproduced here. If Crelle’s (multiplication) 
tables are used, a determinant of the sixth order involving 
partial correlation coefficients to three decimals can, with 
a little practice, be valuated in about twenty minutes, and 
determinants of lower order in less time. 

As a numerical illustration, the values of the standard 
deviations and correlation coefficients between the mean 
temperatures of the months of (1) July, (2) June, ... (5) March, 
computed for Lund, Sweden, by Charlier, are as follows: 


σ1=1.91
σ2=1.99    r12=0.734
σ3=1.99    r13=0.518    r23=0.586
σ4=1.80    r14=0.361    r24=0.409    r34=0.429
σ5=2.68    r15=0.146    r25=0.209    r35=0.303    r45=0.421


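5 Yule’s recursion formula is

$$R_{\lambda\mu(\lambda\mu\nu\ldots\rho\kappa)} = \frac{R_{\lambda\mu} - R_{\lambda\kappa}R_{\mu\kappa}}{\sqrt{(1-R_{\lambda\kappa}^2)(1-R_{\mu\kappa}^2)}},$$

where $R_{\lambda\mu} = R_{\lambda\mu(\lambda\mu\nu\ldots\rho)}$,
$R_{\lambda\kappa} = R_{\lambda\kappa(\lambda\nu\ldots\rho\kappa)}$,
$R_{\mu\kappa} = R_{\mu\kappa(\mu\nu\ldots\rho\kappa)}$.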
Substituting the values of these correlation coefficients in
the determinant (73), the values of the co-factors $S_{1i}$ (i=1,
2, ... 5) were found according to the method outlined above
to be

S11 =  0.410
S12 = -0.264    whence    b12 =  0.617
S13 = -0.053              b13 =  0.125
S14 = -0.027              b14 =  0.070
S15 =  0.024              b15 = -0.041

Since

$$S = S_{11} + r_{12}S_{12} + \cdots + r_{15}S_{15},$$
or 
— Drs 
ae er 


the value of the expression on the right was computed, as a 
check, to be 1.0006 which is satisfactory, considering that the 
values of the correlation coefficients were computed to only 
three decimals. 

The regression equation (77) giving the most probable value 
of the mean temperature in the month of (1) July may now be 
written 

$$x_1 = 0.617x_2 + 0.125x_3 + 0.070x_4 - 0.041x_5,$$


where $x_2$ represents the average temperature in the month of
June of a selected year diminished by the mean temperature
obtained from a large number of years for this month; that is,
$x_2$ is a selected deviation from the mean. The quantities
$x_3$, $x_4$, and $x_5$ have analogous meanings relating to the months
of May, April and March, respectively. The quantity $x_1$
represents the most probable deviation of the temperature of
July from the mean temperature of that month. Similar 
regression equations could be obtained from the original 
determinant S for the other months. 
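
The co-factors of the first row can be recomputed from the tabulated coefficients for Lund. The sketch below is not the author's procedure: it evaluates the determinants by elimination rather than with multiplication tables, the function names are our own, and the regression coefficients are assumed to have the form $b_{1j} = -(\sigma_1/\sigma_j)\,S_{1j}/S_{11}$. Under that assumption it reproduces the printed $S_{11}$, $S_{12}$ and $b_{12}$ to the accuracy given; the remaining values can be compared in the same way.

```python
import math

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, sign, result = len(a), 1.0, 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(a[i][c]))
        if a[p][c] == 0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            sign = -sign
        for i in range(c + 1, n):
            factor = a[i][c] / a[c][c]
            for j in range(c, n):
                a[i][j] -= factor * a[c][j]
        result *= a[c][c]
    return sign * result

def cofactor(m, i, j):
    """Signed minor obtained by striking out row i and column j (0-based)."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(minor)

sigma = [1.91, 1.99, 1.99, 1.80, 2.68]
upper = {(0, 1): 0.734, (0, 2): 0.518, (1, 2): 0.586, (0, 3): 0.361, (1, 3): 0.409,
         (2, 3): 0.429, (0, 4): 0.146, (1, 4): 0.209, (2, 4): 0.303, (3, 4): 0.421}
s = [[1.0 if i == j else upper[(min(i, j), max(i, j))] for j in range(5)]
     for i in range(5)]

s1 = [cofactor(s, 0, j) for j in range(5)]            # S_11 ... S_15
# Assumed form of the coefficients: b_1j = -(sigma_1 / sigma_j) * S_1j / S_11.
b = [-(sigma[0] / sigma[j]) * s1[j] / s1[0] for j in range(1, 5)]
print([round(v, 3) for v in s1])
print([round(v, 3) for v in b])
```

The expansion of S along its first row, S = S11 + r12 S12 + ... + r15 S15, offers an independent check on the co-factors, since S can also be evaluated directly with `det(s)`.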

It should be added that for a complete solution of the problem 
of general forecasting of temperature all other meteorological 
factors should be included in the investigation. 
85. Spurious Correlation. There is one type of correlation
problem whose characteristics should be carefully noted and
which can be treated in only one way to avoid “spurious”
results. It is the type whose data consist of “indices” or
whose pairs of measurements have been obtained by dividing 
by a common divisor. The frequencies of such a correlation 
table may show correlation which is due partly or wholly to 
the effect of the common divisor. A complete treatment of 
the type can not be given here and is unnecessary, for a single 
example can be given which will suffice to justify our contention 
and which will illustrate clearly the possible effect of the 
presence of such common divisors. If the following hypo- 
thetical data were arranged in the usual form of a correlation 
table: 


   x    y   f(x, y)          x    y   f(x, y)
  40   30      8            50   40      2
  20   30      2            40   50      2
  30   30      4            30   20      2
  50   30      4            40   20      4
  60   30      2            50   20      2
  30   40      2            40   10      2
  40   40      4


the table would appear as follows:


                   x
          20   30   40   50   60
     50              2
     40         2    4    2
 y   30    2    4    8    4    2
     20         2    4    2
     10              2


It is evident from the appearance of the table that the data 
were selected with the direct purpose of showing no correlation ; 




for, the values of the product moment and, therefore, of the 
correlation coefficient would be zero. But suppose that each 
pair of measurements were divided by some number, say, 
according to the following plan: 


Divisors    x    y   f(x, y)   New x   New y
   10      40   30      8        4       3
    5      20   30      2        4       6
    6      30   30      4        5       5
   10      50   30      4        5       3
    6      60   30      2       10       5
   10      30   40      2        3       4
   20      40   40      4        2       2
    5      50   40      2       10       8
   10      40   50      2        4       5
    5      30   20      2        6       4
   20      40   20      4        2       1
    5      50   20      2       10       4
   10      40   10      2        4       1
                       40


the new measurements and frequencies, when arranged in a
correlation table, would appear as follows:


                         x
           2    3    4    5    6    7    8    9   10
  y = 8                                            2
      7
      6              2
      5              2    4                        2
      4         2              2                   2
      3              8    4
      2    4
      1    4         2
It is obvious, without any attempt at computation, that
there is now considerable correlation. It is evident, however, 
that the correlation is due entirely to the use of the common 
divisors. 
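
Although no computation is needed to see the effect, it is easy to carry one out. The following sketch takes the divisors, measurements and frequencies from the plan of division given above and computes the correlation coefficient before and after the division; the function and variable names are our own.

```python
import math

# (divisor, x, y, frequency), taken from the plan of division in the text.
rows = [(10, 40, 30, 8), (5, 20, 30, 2), (6, 30, 30, 4), (10, 50, 30, 4),
        (6, 60, 30, 2), (10, 30, 40, 2), (20, 40, 40, 4), (5, 50, 40, 2),
        (10, 40, 50, 2), (5, 30, 20, 2), (20, 40, 20, 4), (5, 50, 20, 2),
        (10, 40, 10, 2)]

def correlation(triples):
    """Correlation coefficient of (x, y, f) triples."""
    n = sum(f for _, _, f in triples)
    mx = sum(f * x for x, _, f in triples) / n
    my = sum(f * y for _, y, f in triples) / n
    sxy = sum(f * (x - mx) * (y - my) for x, y, f in triples)
    sxx = sum(f * (x - mx) ** 2 for x, _, f in triples)
    syy = sum(f * (y - my) ** 2 for _, y, f in triples)
    return sxy / math.sqrt(sxx * syy)

original = [(x, y, f) for _, x, y, f in rows]
divided = [(x / d, y / d, f) for d, x, y, f in rows]
print(round(correlation(original), 3), round(correlation(divided), 3))
```

The first value is zero, as the original table was designed to show; the second is well above one half, and arises solely from the common divisors.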

As a concrete illustration of a problem which would probably 
lead to spurious correlation, suppose that we were investigating 
statistical evidences for the relationship between the effects 
of cancer and of diabetes, as indicated by the comparison 
of the death rates due to these diseases in a large number of 
communities. We wish to know whether, in the long run, the 
size of the death rate due to one disease is associated with 
particular sizes of death rates due to the other disease. 
Suppose that we ascertain the pair of death rates for each of a 
large number of communities and find that the number of 
communities showing a particular pair of death rates appears 
as a frequency in the correlation table and the deviations from 
the mean of the two death rates or their class marks as values 
of x and y. It is obvious that to determine the two death rates
for a community the number of deaths for both diseases must 
be divided by the population of the community, and that the 
value of the correlation coefficient is very apt to prove spurious. 

Spurious results in such problems may be avoided by treating 
the divisors as values of a third variate and using formula (75) 
for multiple correlation. Thus, in the problem concerning 
the correlation between cancer and diabetes it is necessary 
to determine the values of the three correlation coefficients 
between the three characters, “deaths (not death rates) due
to cancer,” “deaths due to diabetes” and “population”
taken in pairs. The values of these three coefficients should 
then be substituted in formula (75) to give the desired value 
of the (multiple) correlation coefficient. 

86. One Criticism of the Use of the Correlation Coefficient. 
Suggestions for Further Study.—Students who are sufficiently 
interested in the analysis of statistics to continue the study 
further should be informed of one defect in the use of the 
correlation coefficient which may prove important in some 
investigations. The means of a set of arrays may lie approximately
on a curve which is not well covered by the preceding
discussion, and there may be much greater correlation in such 
a case than the value of the correlation coefficient would indi- 
cate; in fact, it is easy to formulate hypothetical situations 
in which the correlation is perfect but the value of the corre- 
lation coefficient differs quite significantly from unity. Such 
situations occur very rarely in practice, and it was deemed 
inadvisable to include a treatment of them in this volume. 
It may be sufficient to say that what is called the correlation 
ratio has been found to give better measures of the correlation 
under such circumstances. 

The student is advised to extend his study also to the 
derivations of the various formulas for probable errors which 
he will find in Biometrika, the Drapers Research Memoirs 
and other journals, besides the textbooks on statistics by Yule,
Bowley, Fisher, Elderton, Jones, etc. Considerable information
is contained in the preface of Pearson’s “Tables for
Statisticians.”

The systematic fitting of frequency curves together with 
suitable quadrature formulas is very important, particularly 
the remarkable system of curves known as the Pearson 
frequency curves, a brief account of which may be found 
either in Elderton’s “Frequency Curves and Correlation,”
or Jones’ “First Course in Statistics,” as well as in Pearson’s
“Tables.” The work of statisticians of northern Europe
should not be omitted in that connection, but unfortunately
a reading knowledge of the languages of that part of Europe is
necessary for the complete study of that work. Arne Fisher’s
“Mathematical Theory of Probabilities” constitutes the best
and almost the only account of that work in English and 
includes also a treatment of many other interesting and 
fundamental problems, particularly those connected directly 
with probabilities. A brief account is given there also of a 
subject in the field of statistics which is little appreciated as 
yet in this country—that of invariants. 
.87  741310 743019 744732 746449 748170 749894 751623 753356 755092 756833
.88  758578 760326 762079 763836 765597 767361 769130 770903 772681 774462
.89  776247 778037 779830 781628 783430 785236 787046 788860 790679 792501

.90  794328 796159 797995 799834 801678 803526 805378 807235 809096 810961
.91  812831 814704 816582 818465 820352 822243 824138 826038 827942 829851
.92  831764 833681 835603 837529 839460 841395 843335 845279 847227 849180
.93  851138 853100 855067 857038 859014 860994 862979 864968 866962 868960
.94  870964 872971 874984 877001 879023 881049 883080 885116 887156 889201
.95  891251 893305 895365 897429 899498 901571 903649 905733 907821 909913
.96  912011 914113 916220 918333 920450 922571 924698 926830 928966 931108
.97  933254 935406 937562 939723 941890 944061 946237 948418 950605 952796
.98  954993 957194 959401 961612 963829 966051 968278 970510 972747 974990
.99  977237 979490 981748 984011 986279 988553 990832 993116 995405 997700
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A 
Adjusted (statistical) series, 194 
Algebraic treatment of symbols, 29 
Arrays, 221 
Artillery, field, 157 
Asymmetrical curves, 167 
Average, arithmetic (or mean), 79 
weighted, 80 
deviation, 97 
geometric, 81 
weighted, 81 
harmonic (mean), 81 
median (quartiles, percentiles, etc.), 81 
mode, 85 


B 
Basic factors, 194 
Bernoulli numbers, 31 
series, 178 
Beta function, 60 


C 
Centroid, 102, 119 
vertical, 119 
Charlier check, 110 
coefficient of disturbancy, 202 
Class marks, 83 
limits, 83 
interval, 84 
Coefficient, correlation, 212 
partial, 226 
multiple, 226 
of disturbancy, 202 
regression, 223 
of variation, 99 
Combinations, 65 
Component of a force, 104 
Computation, numerical, 5 
Consistency of observations, 95 
Correlation (theory), 208 
coefficient, 212 
computation, 216 
probable error of, 218 
spurious, 209, 229 
between n variates, 225 
ratio, 232 
surface, 209, 212 
Curve, fitting, 85 
by moments, 112 
by least squares, 149 


D 
Determinants, 225, 227 
Deviation, 88 
average, 97 
standard, 94 
Difference, finite, 13 
leading, 15 
Dispersion, 94, 123 
hypothetical, 190 
Distributions, frequency, 83, 88 
normal, 93, 137 


E 
Errors, absolute and relative, 2 
compensating and accumulative, 3 
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Errors, extended meaning of, 90 
maximum, 156 
probable, 153 
maximum, 156 
standard, 156 
Expectation, mathematical, 76 
of life, 54 


F 
Factorial, 17-18 
Field artillery, 157 
Force, moment of, 103 
Frequency curves, 85 
Pearson, 139 
distributions, 83, 88 
surfaces, 88 
Functions, Beta, 60 
Gamma, 56 
rational, 27 
rational integral, 17 


G 
Gamma function, 56 
Graduations of frequencies, 85, 118, 
114, 126, 141-145 
Guiding principle of probable error, 
161 


H 
Histogram, rectangular, 86, 88 
Homogeneity of populations, 70 
Hypothetical dispersion, 190 


I 
Indeterminate forms, 57 
Indices in correlation, 209, 229 
Integration, finite, 21 
by parts, 26 

by substitution, 57 

Interpolation by Newton’s formula, 
34 
by Lagrange’s formula, 36 


Interpolation by leading-difference 
formulas, 40, 44, 47 
tangential, 42 
of ordinates among areas, 45 
of areas, 48 
Interpretation of correlation coefficient, 224 
of regression coefficient, 223 


L 
Lagrange’s interpolation formula, 36 
Least squares, 148 
curve fitting, 149 
Lexian series, 178 
ratio, 201 


M 
Maximum error, 156 
Mean or arithmetic average, 79, 102 
harmonic, 81 
provisional or trial, 88 
Moments, definition, 109 
simple, 102 
curve fitting, 112 
unit, 117 
about mean, 119 
product, 125, 134, 212 
unit, 214 
of point binomial, 173 
of a force, 103 
summation method, 124, 129 
Mortality tables, abridged, 51 
Multiple correlation, 226 


N 
Newton’s formula, 19 
Normal curve, 136 
derivation of equation, 139 
graduations, 141-145 
tables of ordinates and areas, 145 


(subnormal, hypernormal) series, 
190 
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P 
Partial correlation coefficient, 226 
differentiation, 149 
Point binomial, 167 
application of, 173 
Poisson series, 178 
Probability, a priori, 64 
empirical or a posteriori, 70 
factors, 157 
Probable error in a single observation, 153, 155 
maximum, 156 
of various quantities, 160 
Problems, some famous, 76 


Q 
Quadrature formulas, 118 


R 
Reason, cogent, 74 
insufficient, 74 
Recursion formulas, 227 
Regression, 221, 226 
coefficient, 223 
equation, 221 


Residual (statistical) series, 205 
Root-mean-square, 94, 123 


S 
Seasonal variation, 127 
Secular fluctuations of a statistical 
series, 203 
Series, statistical (Bernoullian, Pois- 
son, Lexian), 178 
normal, etc., 190 
adjusted, 194 
residual, 205 
Skewness of frequency curves, 167 
Standard deviation, 94, 123 
error, 156 
Statistical series, 178 
Summation of series, 22, 32 
method of computing moments, 
124, 129 


T 
Taylor’s expansion, 30 


V 
Variability, 99 
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